Improving predictive power through deep learning analysis of K-12 online student behaviors and discussion board content

2020 ◽  
Vol 48 (4) ◽  
pp. 199-212 ◽  
Author(s):  
Jui-Long Hung ◽  
Kerry Rice ◽  
Jennifer Kepka ◽  
Juan Yang

Purpose In educational data mining and learning analytics, predicting student performance and providing early warning are among the most popular research topics. However, few studies have used machine learning and deep learning (DL) models for predictive analytics that combine behavioral data with text analysis. Design/methodology/approach This study combined behavioral data and discussion board content to construct early warning models with machine learning and DL algorithms. In total, 680 course sections, 12,869 students and 14,951,368 logs were collected from a K-12 virtual school in the USA. Three rounds of experiments were conducted to demonstrate the effectiveness of the proposed approach. Findings The DL model performed better than the machine learning models and captured 51% of at-risk students in the eighth week with 86.8% overall accuracy. Combining behavioral and textual data further improved the model’s recall and accuracy. Total word count was a more general indicator than the textual content features. When the text was imported into a linguistic function word analysis tool, successful students showed more words in the analytic category and at-risk students showed more words in the authentic category. The balanced threshold was 0.315, which can capture up to 59% of at-risk students. Originality/value The results of this exploratory study indicate that using student behaviors and text in a DL approach may improve the power to identify at-risk learners early enough in the learning process to allow for interventions that can change their trajectory.
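The role of a decision threshold such as the 0.315 reported above can be illustrated with a minimal sketch. It is not the authors' model: the function name `recall_and_accuracy` and the scores and labels are hypothetical, chosen only to show that lowering the threshold tends to raise recall on at-risk students at some cost in overall accuracy.

```python
def recall_and_accuracy(probs, labels, threshold):
    """Flag a student as at-risk when the predicted probability >= threshold.

    Returns (recall on the at-risk class, overall accuracy).
    """
    preds = [1 if p >= threshold else 0 for p in probs]
    true_pos = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    false_neg = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    correct = sum(1 for y, p in zip(labels, preds) if y == p)
    at_risk_total = true_pos + false_neg
    recall = true_pos / at_risk_total if at_risk_total else 0.0
    return recall, correct / len(labels)


# Hypothetical predicted probabilities and true labels (1 = at-risk).
probs = [0.05, 0.10, 0.20, 0.32, 0.35, 0.40, 0.45, 0.48, 0.60, 0.80]
labels = [0, 0, 0, 1, 0, 0, 0, 1, 1, 1]

default_result = recall_and_accuracy(probs, labels, 0.5)
lowered_result = recall_and_accuracy(probs, labels, 0.315)
print(default_result, lowered_result)
```

On this toy data the lower threshold flags every at-risk student but misclassifies more students who were not at risk, which is the trade-off a balanced threshold is meant to manage.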

2020 ◽  
Author(s):  
Ryan Shaun Baker ◽  
Andy Berning ◽  
Sujith M. Gowda

At-risk prediction and early warning initiatives have become a core part of contemporary practice in American high schools, with the goal of identifying students at risk of poorer outcomes, determining which factors are associated with these risks, and developing interventions to support at-risk students’ individual needs. However, efforts along these lines have typically ignored whether a student is military-connected. Given the many differences between military-connected students and other students, we investigate whether models developed for non-military-connected students still function effectively for military-connected students, studying the specific cases of graduation prediction and SAT score prediction. We then identify which variables differ most between the two populations in how they relate to student outcomes.


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Deepa S.N.

Purpose Models developed in previous studies were limited by occurrences of global minima; this study therefore developed a new intelligent ubiquitous computational model that learns with the gradient descent learning rule and operates with auto-encoders and decoders to attain better energy optimization. The ubiquitous machine learning computational model trains more effectively than regular supervised or unsupervised learning computational models with deep learning techniques, resulting in better learning and optimization for the considered problem domain of cloud-based internet-of-things (IOTs). This study aims to improve the network quality and the data accuracy rate during the network transmission process using the developed ubiquitous deep learning computational model. Design/methodology/approach In this research study, a novel intelligent ubiquitous machine learning computational model is designed and modelled to maintain the optimal energy level of cloud IOTs in sensor network domains. The model learns with the gradient descent learning rule and operates with auto-encoders and decoders to attain better energy optimization. A new unified deterministic sine-cosine algorithm has been developed in this study for parameter optimization of weight factors in the ubiquitous machine learning model. Findings The newly developed ubiquitous model is used for finding network energy and performing its optimization in the considered sensor network model. During progressive simulation, residual energy, network overhead, end-to-end delay, network lifetime and the number of live nodes are evaluated. The results show that the ubiquitous deep learning model achieved better metrics owing to its appropriate cluster selection and minimized route selection mechanism.
Research limitations/implications In this research study, a novel ubiquitous computing model based on a new optimization algorithm, called a unified deterministic sine-cosine algorithm, and a deep learning technique was developed and applied to maintain the optimal energy level of cloud IOTs in sensor networks. The deterministic levy flight concept is applied in developing the new optimization technique, which determines the parametric weight values for the deep learning model. The ubiquitous deep learning model is designed with auto-encoders and decoders, and the weights of their corresponding layers are set to optimal values by the optimization algorithm. The modelled ubiquitous deep learning approach was applied in this study to determine the network energy consumption rate and thereby optimize the energy level by increasing the lifetime of the sensor network model considered. For all the considered network metrics, the ubiquitous computing model has proved to be more effective and versatile than previous approaches from earlier research studies. Practical implications The developed ubiquitous computing model with deep learning techniques can be applied to any type of cloud-assisted IOTs in respect of wireless sensor networks, ad hoc networks, radio access technology networks, heterogeneous networks, etc. Practically, the developed model facilitates computing the optimal energy level of the cloud IOTs for any considered network model, which helps in maintaining a better network lifetime and reducing the end-to-end delay of the networks. Social implications The social implication of the proposed research study is that it helps in reducing energy consumption and increases the network lifetime of cloud IOT-based sensor network models. This approach helps people at large to have a better transmission rate with minimized energy consumption and also reduces the delay in transmission.
Originality/value In this research study, the network optimization of cloud-assisted IOTs of sensor network models is modelled and analysed using machine learning models as a kind of ubiquitous computing system. Ubiquitous computing models with machine learning techniques support the development of intelligent systems and help users make better and faster decisions. In the communication domain, the use of predictive and optimization models created with machine learning accelerates new ways to determine solutions to problems. Considering the importance of learning techniques, the ubiquitous computing model is designed based on a deep learning strategy, and the learning mechanism adapts itself to attain a better network optimization model.


2016 ◽  
Vol 23 (2) ◽  
pp. 124 ◽  
Author(s):  
Douglas Detoni ◽  
Cristian Cechinel ◽  
Ricardo Araujo Matsumura ◽  
Daniela Francisco Brauner

Student dropout is one of the main problems faced by distance learning courses. One of the major challenges for researchers is to develop methods to predict the behavior of students so that teachers and tutors are able to identify at-risk students as early as possible and provide assistance before they drop out or fail their courses. Machine learning models have been used to predict or classify students in these settings. However, while these models have shown promising results in several settings, they usually attain these results using attributes that are not immediately transferable to other courses or platforms. In this paper, we provide a methodology to classify students using only interaction counts from each student. We evaluate this methodology on a data set from two majors based on the Moodle platform. We run experiments consisting of training and evaluating three machine learning models (Support Vector Machines, Naive Bayes and AdaBoost decision trees) under different scenarios. We provide evidence that patterns from interaction counts can provide useful information for classifying at-risk students. This classification allows the customization of the activities presented to at-risk students (automatically or through tutors) in an attempt to prevent student dropout.
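As an illustration of what "interaction counts" might look like as classifier input, the sketch below turns raw log rows into one count vector per student. The row layout `(student_id, week)`, the per-week binning and the name `count_features` are assumptions made for this example; the abstract does not specify how the counts were aggregated.

```python
def count_features(log_rows, n_weeks):
    """Build a per-student vector of interaction counts, one entry per week.

    log_rows: iterable of (student_id, week) pairs, with week starting at 0.
    Rows whose week falls outside [0, n_weeks) are ignored.
    """
    features = {}
    for student_id, week in log_rows:
        if 0 <= week < n_weeks:
            # Create a zero vector on first sight of a student, then increment.
            features.setdefault(student_id, [0] * n_weeks)[week] += 1
    return features


# Hypothetical platform log: each row is one logged interaction.
rows = [("s1", 0), ("s1", 0), ("s1", 2), ("s2", 1), ("s2", 5)]
vectors = count_features(rows, n_weeks=3)
print(vectors)
```

Because the vectors contain only counts, the same feature layout could in principle be rebuilt from any platform that logs who interacted and when, which is the transferability argument the abstract makes.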


2020 ◽  
Vol 27 (8) ◽  
pp. 1891-1912
Author(s):  
Hengqin Wu ◽  
Geoffrey Shen ◽  
Xue Lin ◽  
Minglei Li ◽  
Boyu Zhang ◽  
...  

Purpose This study proposes an approach to solve the fundamental problem in using query-based methods (i.e. search engines and patent retrieval tools) to screen patents of information and communication technology in construction (ICTC). The fundamental problem is that ICTC incorporates various techniques and thus cannot be simply represented by man-made queries. To address this concern, this study develops a binary classifier by utilizing deep learning and NLP techniques to automatically identify whether a patent is relevant to ICTC, thus accurately screening a corpus of ICTC patents. Design/methodology/approach This study employs NLP techniques to convert the textual data of patents into numerical vectors. Then, a supervised deep learning model is developed to learn the relations between the input vectors and outputs. Findings The validation results indicate that (1) the proposed approach performs better in screening ICTC patents than traditional machine learning methods; (2) besides the United States Patent and Trademark Office (USPTO), which provides structured and well-written patents, the approach could also accurately screen patents from Derwent Innovations Index (DIX), in which patents are written in different genres. Practical implications This study contributes a specific collection of ICTC patents, which is not provided by the patent offices. Social implications The proposed approach contributes an alternative way of gathering a corpus of patents for domains like ICTC that neither exist as a searchable classification in patent offices nor are accurately represented by man-made queries. Originality/value A deep learning model with two layers of neurons is developed to learn the non-linear relations between the input features and outputs, providing better performance than traditional machine learning models. This study uses advanced NLP techniques, lemmatization and part-of-speech (POS) tagging, to process textual data of ICTC patents. This study contributes a specific collection of ICTC patents, which is not provided by the patent offices.


2020 ◽  
Vol 10 (13) ◽  
pp. 4427 ◽  
Author(s):  
David Bañeres ◽  
M. Elena Rodríguez ◽  
Ana Elena Guerrero-Roldán ◽  
Abdulkadir Karadeniz

Artificial intelligence has impacted education in recent years. Datafication of education has allowed the development of automated methods to detect patterns in extensive collections of educational data to estimate unknown information and behavior about the students. This research has focused on finding accurate predictive models to identify at-risk students. Meeting this challenge may reduce the students’ risk of failure or disengagement by decreasing the time lag between identification and the real at-risk state. The contribution of this paper is threefold. First, an in-depth analysis of a predictive model to detect at-risk students is performed. This model has been tested using data available in an institutional data mart where curated data from six semesters are available, and a method to obtain the best classifier and training set is proposed. Second, a method to determine a threshold for evaluating the quality of the predictive model is established. Third, an early warning system has been developed and tested in a real educational setting, proving accurate and useful for its purpose of detecting at-risk students in online higher education. The stakeholders (i.e., students and teachers) can analyze the information through different dashboards, and teachers can also send early feedback as an intervention mechanism to mitigate at-risk situations. The system has been evaluated on two undergraduate courses, where results show high accuracy in correctly detecting at-risk students.


Kybernetes ◽  
2017 ◽  
Vol 46 (4) ◽  
pp. 693-705 ◽  
Author(s):  
Yasser F. Hassan

Purpose This paper aims to utilize machine learning and soft computing to propose a new method of rough sets using deep learning architecture for many real-world applications. Design/methodology/approach The objective of this work is to propose a model for deep rough set theory that uses more than one decision table and approximates these tables to a classification system, i.e. the paper proposes a novel framework of deep learning based on multi-decision tables. Findings The paper tries to coordinate the local properties of individual decision tables to provide an appropriate global decision from the system. Research limitations/implications Rough set learning assumes the existence of a single decision table, whereas real-world decision problems involve several decisions with several different decision tables. The new proposed model can handle multi-decision tables. Practical implications The proposed classification model is implemented on social networks with preferred features which are freely distributed as social entities, with accuracy around 91 per cent. Social implications Deep learning using rough set theory simulates the way the brain thinks and can solve the problem of different information about the same problem existing in different decision systems. Originality/value This paper utilizes machine learning and soft computing to propose a new method of rough sets using deep learning architecture for many real-world applications.


2021 ◽  
Author(s):  
Cameron I. Cooper ◽  
Kamea J. Cooper

Nationally, more than one-third of students enrolling in introductory computer science programming courses (CS101) do not succeed. To improve student success rates, this research team used supervised machine learning to identify students who are “at-risk” of not succeeding in CS101 at a two-year public college. The resultant predictive model accurately identifies approximately 99% of “at-risk” students in an out-of-sample test data set. The programming instructor piloted the use of the model’s predictive factors as early alert triggers to intervene with individualized outreach and support across three course sections of CS101 in fall 2020. The outcome of this pilot study was a 23% increase in student success and a 7.3 percentage point decrease in DFW rate. More importantly, this study identified academic early alert triggers for CS101. Specifically, the first two graded programs are of paramount importance for student success in the course.


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Paolo Dello Vicario ◽  
Valentina Tortolini

Purpose The purpose of this paper is to define a methodology to analyze links between programming topics and libraries starting from GitHub data. Design/methodology/approach This paper developed an analysis of machine learning repositories on GitHub, finding communities of repositories and studying the anatomy of collaboration around a popular topic such as machine learning. Findings This analysis indicates the significant importance of programming languages and technologies such as Python and Jupyter Notebook. It also shows the rise of deep learning and of specific libraries such as TensorFlow from Google. Originality/value There exists no survey or analysis of how developers influence each other on specific topics. Other researchers focused their analysis on the collaborative structure and social impact instead of topic impact. Using this methodology to analyze programming topics is important not just for machine learning but also for other topics.


2017 ◽  
Vol 1 (1) ◽  
pp. 8-18
Author(s):  
Adam Christian Haupt ◽  
Jonathan Alt ◽  
Samuel Buttrey

Purpose This paper aims to use a data-driven approach to identify the factors and metrics that provide the best indicators of academic attrition in the Korean language program at the Defense Language Institute Foreign Language Center. Design/methodology/approach This research develops logistic regression models to aid in the identification of at-risk students in the Defense Language Institute’s Korean language school. Findings The results from this research demonstrate that this methodology can detect significant factors and metrics that identify students at risk. Additionally, this research shows that school policy changes can be detected using logistic regression models and stepwise regression. Originality/value This research represents a real-world application of logistic regression modeling methods to the problem of identifying students at risk of attrition or other negative outcomes for the purpose of academic intervention. By using logistic regression, the authors are able to gain a greater understanding of the problem and identify statistically significant predictors of student attrition that they believe can be converted into meaningful policy change.
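For readers unfamiliar with the technique, logistic regression for at-risk identification can be sketched in a few lines. This is a minimal illustration fitted by plain batch gradient descent, not the authors' models: the two features (a scaled early test score and a scaled absence rate), the data values and the function names are all hypothetical, and the sketch omits the significance testing and stepwise selection the abstract mentions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the mean log loss."""
    n_features = len(xs[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(xs, ys):
            # Prediction error for this student drives the gradient.
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        w = [wi - lr * g / len(xs) for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / len(xs)
    return w, b


def predict_proba(w, b, x):
    """Estimated probability that a student with features x leaves the program."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


# Hypothetical students: [scaled early test score, scaled absence rate].
xs = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.1], [0.3, 0.8], [0.2, 0.9], [0.4, 0.7]]
ys = [0, 0, 0, 1, 1, 1]  # 1 = left the program

w, b = fit_logistic(xs, ys)
print(w, b, predict_proba(w, b, [0.25, 0.85]))
```

The sign of each fitted weight indicates the direction in which a factor moves the estimated risk, which is what makes this family of models attractive when the goal is to inform policy rather than only to predict.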

