Dropout Prediction in MOOCs: Using Deep Learning for Personalized Intervention

2018 ◽  
Vol 57 (3) ◽  
pp. 547-570 ◽  
Author(s):  
Wanli Xing ◽  
Dongping Du

Massive open online courses (MOOCs) show great potential to transform traditional education through the Internet. However, the high attrition rates in MOOCs have often been cited as a scale-efficacy tradeoff. Traditional educational approaches are usually unable to identify such a large number of at-risk students in time to support effective intervention design. While building dropout prediction models using learning analytics is promising for informing intervention design for these at-risk students, the results of current prediction model construction methods do not enable personalized intervention for these students. In this study, we take an initial step to optimize dropout prediction model performance toward intervention personalization for at-risk students in MOOCs. Specifically, based on a temporal prediction mechanism, this study proposes using a deep learning algorithm to construct the dropout prediction model and to produce a predicted dropout probability for each individual student. By taking advantage of the power of deep learning, this approach not only constructs more accurate dropout prediction models than baseline algorithms but also offers a way to personalize and prioritize intervention for at-risk students in MOOCs by using individual dropout probabilities. The findings from this study and their implications are then discussed.
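
The prioritization step described above can be sketched as follows. This is a minimal illustration that assumes per-student dropout probabilities have already been produced by some model; the function name, the student IDs, and the 0.5 cut-off are hypothetical and are not taken from the study.

```python
from typing import Dict, List, Tuple


def prioritize_interventions(
    dropout_prob: Dict[str, float], threshold: float = 0.5
) -> List[Tuple[str, float]]:
    """Return at-risk students, highest predicted dropout probability first.

    `threshold` is an assumed cut-off for flagging a student as at-risk.
    """
    at_risk = [(sid, p) for sid, p in dropout_prob.items() if p >= threshold]
    # Sort descending so the students most likely to drop out come first.
    return sorted(at_risk, key=lambda item: item[1], reverse=True)


# Hypothetical predicted probabilities, for illustration only.
example_probs = {"s1": 0.91, "s2": 0.35, "s3": 0.67}
ranked = prioritize_interventions(example_probs)
```

Ranking by probability rather than by a binary at-risk label is what would let limited support resources go to the students with the highest predicted risk first.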

Author(s):  
Mu Lin Wong ◽  
Senthil S.

Academic performance prediction models must be not only accurate but also timely, so that at-risk students are identified as early as possible and can be offered remedial support. Heart rate data from 50 students in 3 main courses were collected, processed, and analyzed to distinguish excellent students from at-risk students. Three of the 12 heart rate attributes were chosen to calculate the threshold values, which are used to predict at-risk students. Half of the at-risk students were identified after week 5. Later, the datasets were rebalanced. Using four data mining classifiers, six attributes were identified as the best attributes for prediction model development. The datasets were then dimensionally reduced. Applying classification, half of the at-risk students were identified at the earliest around week 5 of the 12-week semester. J48 is the most robust classifier compared with JRip, Multi-Level-Perceptron, and RandomForest, making accurate predictions on at-risk students earlier most of the time.


2020 ◽  
Vol 2 (3) ◽  
pp. 140-152 ◽  
Author(s):  
Vignesh Muthukumar ◽  
Dr. Bhalaji N.

Massive Open Online Courses (MOOCs) have seen a dramatic increase in participants in the last few years, alongside exponential growth in Internet users around the world. MOOCs allow users to attend lectures by top professors from world-class universities. Despite this flexible accessibility, a common trend observed in each course is that the number of active participants appears to decrease exponentially as the weeks progress. The structure and nature of the courses directly affect the number of active participants. A comprehensive review of the available literature shows that very little intensive work has been done in the field of MOOC data analysis using the pattern of user interaction with courses. In this paper, we take an initial step toward using a deep learning algorithm to construct the dropout prediction model and produce a predicted dropout probability for each individual student. Additional improvements are made to optimize the performance of the dropout prediction model and to provide course providers with appropriate interventions based on a temporal prediction mechanism. Our exploratory data analysis demonstrates a strong correlation between clickstream actions and successful learner outcomes. Among other features, the deep learning algorithm takes the weekly history of student data into account and is thus able to notice changes in student behaviour over time.


2020 ◽  
Vol 26 (33) ◽  
pp. 4195-4205
Author(s):  
Xiaoyu Ding ◽  
Chen Cui ◽  
Dingyan Wang ◽  
Jihui Zhao ◽  
Mingyue Zheng ◽  
...  

Background: Enhancing a compound’s biological activity is the central task of lead optimization in small-molecule drug discovery. However, performing many iterative rounds of compound synthesis and bioactivity testing is laborious. To address this issue, there is strong demand for high-quality in silico bioactivity prediction approaches that prioritize more active compound derivatives and reduce trial and error. Methods: Two kinds of bioactivity prediction models based on a large-scale structure-activity relationship (SAR) database were constructed. The first is based on the similarity of substituents and realized by matched molecular pair analysis, including SA, SA_BR, SR, and SR_BR. The second is based on SAR transferability and realized by matched molecular series analysis, including Single MMS pair, Full MMS series, and Multi single MMS pairs. Moreover, we defined the application domain of the models by using a distance-based threshold. Results: Among the seven individual models, the Multi single MMS pairs bioactivity prediction model showed the best performance (R² = 0.828, MAE = 0.406, RMSE = 0.591), and the baseline model (SA) produced the lowest prediction accuracy (R² = 0.798, MAE = 0.446, RMSE = 0.637). The predictive accuracy could be further improved by consensus modeling (R² = 0.842, MAE = 0.397, and RMSE = 0.563). Conclusion: An accurate prediction model for bioactivity was built with a consensus method, which was superior to all individual models. Our model should be a valuable tool for lead optimization.
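
For readers unfamiliar with the reported metrics, the sketch below shows how R², MAE, and RMSE are conventionally computed and how a simple consensus prediction can be formed. The abstract does not state how the consensus was built, so the per-compound arithmetic mean used here is an assumption, and all function names and numbers are hypothetical.

```python
import math
from typing import List, Sequence


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def consensus(model_preds: Sequence[Sequence[float]]) -> List[float]:
    """Average the predictions of several models, compound by compound.

    Assumption: plain arithmetic mean; the source does not specify the scheme.
    """
    n_models = len(model_preds)
    return [sum(col) / n_models for col in zip(*model_preds)]


# Hypothetical activities and predictions, for illustration only.
observed = [1.0, 2.0, 3.0]
predicted = [1.0, 2.0, 4.0]
averaged = consensus([[1.0, 2.0], [3.0, 4.0]])
```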


2020 ◽  
Author(s):  
Young Min Park ◽  
Byung-Joo Lee

Abstract Background: This study analyzed the prognostic significance of nodal factors, including the number of metastatic LNs and LNR, in patients with PTC, and attempted to construct a disease recurrence prediction model using machine learning techniques. Methods: We retrospectively analyzed clinico-pathologic data from 1040 patients diagnosed with papillary thyroid cancer between 2003 and 2009. Results: We analyzed clinico-pathologic factors related to recurrence through logistic regression analysis. Among the factors that we included, only sex and tumor size were significantly correlated with disease recurrence. Parameters such as age, sex, tumor size, tumor multiplicity, ETE, ENE, pT, pN, ipsilateral central LN metastasis, contralateral central LN metastasis, number of metastatic LNs, and LNR were used as inputs for constructing a machine learning prediction model. The performance of five machine learning models for recurrence prediction was compared based on accuracy. The Decision Tree model showed the best accuracy at 95%, and the lightGBM and stacking model together showed 93% accuracy. Conclusions: We confirmed that all machine learning prediction models showed an accuracy of 90% or more for predicting disease recurrence in PTC. Large-scale multicenter clinical studies should be performed to improve the performance of our prediction models and verify their clinical effectiveness.


2019 ◽  
Vol 62 (3) ◽  
pp. 987-1003 ◽  
Author(s):  
Yan Chen ◽  
Qinghua Zheng ◽  
Shuguang Ji ◽  
Feng Tian ◽  
Haiping Zhu ◽  
...  

2021 ◽  
Vol 6 (1) ◽  
pp. e003451
Author(s):  
Arjun Chandna ◽  
Rainer Tan ◽  
Michael Carter ◽  
Ann Van Den Bruel ◽  
Jan Verbakel ◽  
...  

Introduction: Early identification of children at risk of severe febrile illness can optimise referral, admission and treatment decisions, particularly in resource-limited settings. We aimed to identify prognostic clinical and laboratory factors that predict progression to severe disease in febrile children presenting from the community. Methods: We systematically reviewed publications retrieved from MEDLINE, Web of Science and Embase between 31 May 1999 and 30 April 2020, supplemented by hand search of reference lists and consultation with an expert Technical Advisory Panel. Studies evaluating prognostic factors or clinical prediction models in children presenting from the community with febrile illnesses were eligible. The primary outcome was any objective measure of disease severity ascertained within 30 days of enrolment. We calculated unadjusted likelihood ratios (LRs) for comparison of prognostic factors, and compared clinical prediction models using the area under the receiver operating characteristic curves (AUROCs). Risk of bias and applicability of studies were assessed using the Prediction Model Risk of Bias Assessment Tool and the Quality In Prognosis Studies tool. Results: Of 5949 articles identified, 18 studies evaluating 200 prognostic factors and 25 clinical prediction models in 24 530 children were included. Heterogeneity between studies precluded formal meta-analysis. Malnutrition (positive LR range 1.56–11.13), hypoxia (2.10–8.11), altered consciousness (1.24–14.02), and markers of acidosis (1.36–7.71) and poor peripheral perfusion (1.78–17.38) were the most common predictors of severe disease. Clinical prediction model performance varied widely (AUROC range 0.49–0.97). Concerns regarding applicability were identified and most studies were at high risk of bias. Conclusions: Few studies address this important public health question. We identified prognostic factors from a wide range of geographic contexts that can help clinicians assess febrile children at risk of progressing to severe disease. Multicentre studies that include outpatients are required to explore generalisability and develop data-driven tools to support patient prioritisation and triage at the community level. PROSPERO registration number: CRD42019140542.
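
As background on the unadjusted likelihood ratios mentioned above, the following sketch computes positive and negative LRs from a 2×2 table using the standard definitions (LR+ = sensitivity / (1 − specificity), LR− = (1 − sensitivity) / specificity). The function name and the counts are hypothetical and do not come from the review.

```python
from typing import Tuple


def likelihood_ratios(tp: int, fp: int, fn: int, tn: int) -> Tuple[float, float]:
    """Return (positive LR, negative LR) from 2x2 table counts.

    tp/fn: children with severe disease who did / did not have the factor.
    fp/tn: children without severe disease who did / did not have the factor.
    """
    if tp + fn == 0 or fp + tn == 0:
        raise ValueError("Both outcome groups must contain at least one child.")
    sensitivity = tp / (tp + fn)
    false_positive_rate = fp / (fp + tn)  # equals 1 - specificity
    specificity = tn / (fp + tn)
    if false_positive_rate == 0 or specificity == 0:
        raise ValueError("Likelihood ratio is undefined for these counts.")
    positive_lr = sensitivity / false_positive_rate
    negative_lr = (1 - sensitivity) / specificity
    return positive_lr, negative_lr


# Hypothetical counts, for illustration only.
lr_pos, lr_neg = likelihood_ratios(tp=30, fp=10, fn=20, tn=90)
```

A positive LR above 1 means the factor is more common among children who progress to severe disease than among those who do not.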


2021 ◽  
Vol 2021 ◽  
pp. 1-3
Author(s):  
Makoto Hashizume

Multidisciplinary computational anatomy (MCA) is a new frontier of science that provides a mathematical analysis basis for the comprehensive and useful understanding of “dynamic living human anatomy.” It defines a new mathematical modeling method for the early detection and highly intelligent diagnosis and treatment of incurable or intractable diseases. The MCA is a method of scientific research on innovative areas based on the medical images that are integrated with the information related to: (1) the spatial axis, extending from a cell size to an organ size; (2) the time series axis, extending from an embryo to post mortem body; (3) the functional axis on physiology or metabolism which is reflected in a variety of medical image modalities; and (4) the pathological axis, extending from a healthy physical condition to a diseased condition. It aims to integrate multiple prediction models such as multiscale prediction model, temporal prediction model, anatomy function prediction model, and anatomy-pathology prediction model. Artificial intelligence has been introduced to accelerate the calculation of statistic mathematical analysis. The future perspective is expected to promote the development of human resources as well as a new MCA-based scientific interdisciplinary field composed of mathematical statistics, information sciences, computing data science, robotics, and biomedical engineering and clinical applications. The MCA-based medicine might be one of the solutions to overcome the difficulties in the current medicine.


2020 ◽  
Author(s):  
Ryosuke Kojima ◽  
Shoichi Ishida ◽  
Masateru Ohta ◽  
Hiroaki Iwata ◽  
Teruki Honma ◽  
...  

Deep learning is developing as an important technology for performing various tasks in cheminformatics. In particular, graph convolutional neural networks (GCNs) have been reported to perform well in many types of prediction tasks related to molecules. Although GCNs exhibit considerable potential in various applications, using them appropriately to obtain reasonable and reliable prediction results requires a thorough understanding of GCNs and of programming. To bring the power of GCNs to users ranging from chemists to cheminformaticians, an open-source GCN tool, kGCN, is introduced. To support users with various levels of programming skill, kGCN includes three interfaces: a graphical user interface (GUI) employing KNIME for users with limited programming skills such as chemists, as well as command-line and Python library interfaces for users with advanced programming skills such as cheminformaticians. To support the three steps required for building a prediction model, i.e., pre-processing, model tuning, and interpretation of results, kGCN includes functions for typical pre-processing, Bayesian optimization for automatic model tuning, and visualization of the atomic contribution to prediction for interpretation of results. kGCN supports three types of approaches: single-task, multi-task, and multimodal predictions. The prediction of compound-protein interaction for four matrix metalloproteases, MMP-3, -9, -12 and -13, in inhibition assays is performed as a representative case study using kGCN. Additionally, kGCN provides visualization of atomic contributions to the prediction. Such visualization is useful for validating prediction models and for designing molecules based on the prediction model, realizing “explainable AI” for understanding the factors affecting AI prediction. kGCN is available at https://github.com/clinfo/kGCN.


2019 ◽  
Vol 8 (3) ◽  
pp. 5916-5920

Timeliness has been a missing factor in many studies on academic performance prediction for identifying at-risk students. This study evaluated the feasibility of predicting students’ performance based on heart rate data collected during classes. The data were collected in the first four weeks after semester commencement to validate accurate prediction that would enable educators to introduce remedial intervention for at-risk students. Another aim of this study was to determine the best threshold values for the different types of heart rate fluctuations that can be used in predicting academic achievement. The threshold values were tested further to verify whether the prediction model for an individual course or for combined courses was more accurate. Results revealed that heart rate data alone can achieve a maximum prediction accuracy of 88% and recall of 100%. Threshold values calculated from derived heart rate fluctuation types produce the best results. Prediction models for individual courses outperform the model using average threshold values of all courses.
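
A threshold-based predictor of the kind described above can be sketched as follows, together with the accuracy and recall calculations. The abstract does not say which direction of fluctuation indicates risk, so flagging values above the threshold is an assumption; the function names, feature values, and threshold are all hypothetical.

```python
from typing import List, Sequence, Tuple


def predict_at_risk(values: Sequence[float], threshold: float) -> List[bool]:
    """Flag a student as at-risk when the feature exceeds the threshold.

    Assumption: higher values indicate risk; the source does not specify this.
    """
    return [v > threshold for v in values]


def accuracy_and_recall(
    actual: Sequence[bool], predicted: Sequence[bool]
) -> Tuple[float, float]:
    """Return (accuracy, recall), with recall measured on the at-risk class."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    true_pos = sum(a and p for a, p in zip(actual, predicted))
    actual_pos = sum(actual)
    if actual_pos == 0:
        raise ValueError("Recall is undefined without any at-risk students.")
    return correct / len(actual), true_pos / actual_pos


# Hypothetical heart rate fluctuation values and outcomes, for illustration only.
fluctuations = [12.0, 5.0, 9.0, 3.0]
truly_at_risk = [True, False, False, False]
flags = predict_at_risk(fluctuations, threshold=8.0)
acc, rec = accuracy_and_recall(truly_at_risk, flags)
```

In this toy data, recall is perfect while accuracy is lower, because one student is flagged unnecessarily; the same pattern of high recall with somewhat lower accuracy appears in the figures the abstract reports.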

