Distributing Epistemic Functions and Tasks - A Framework for Augmenting Human Analytic Power With Machine Learning in Science Education Research

2022 ◽  
Author(s):  
Marcus Kubsch ◽  
Christina Krist ◽  
Joshua Rosenberg

Machine learning has become commonplace in educational research and science education research, especially to support assessment efforts. Such applications of machine learning have shown their promise in replicating and scaling human-driven coding of students' work. Despite this promise, we and other scholars argue that machine learning has not achieved its transformational potential. We argue that this is because our field currently lacks frameworks for supporting creative, principled, and critical endeavors to use machine learning in science education research. To offer considerations for science education researchers' use of machine learning, we present a framework, Distributing Epistemic Functions and Tasks (DEFT), that highlights the functions and tasks involved in generating knowledge that can be carried out by either trained researchers or machine learning algorithms. Such considerations are critical decisions that should occur alongside those about, for instance, the type of data or algorithm used. We apply this framework to two cases, one that exemplifies the cutting-edge use of machine learning in science education research and another that offers a wholly different means of using machine learning and human-driven inquiry together. We conclude with strategies for researchers to adopt machine learning and call for the field to rethink how we prepare science education researchers in an era of great advances in computational power and access to machine learning methods.

2019 ◽  
Vol 24 (34) ◽  
pp. 3998-4006
Author(s):  
Shijie Fan ◽  
Yu Chen ◽  
Cheng Luo ◽  
Fanwang Meng

Background: On a tide of big data, machine learning is coming into its own. Given the huge amounts of epigenetic data generated by biological experiments and in the clinic, machine learning can help detect epigenetic features in the genome, find correlations between phenotypes and modifications to histones or genes, accelerate the screening of lead compounds targeting epigenetic diseases, and support many other aspects of epigenetics research, thereby advancing the goal of precision medicine. Methods: In this minireview, we focus on the fundamentals and applications of machine learning methods regularly used in the epigenetics field and explain their features. Their advantages and disadvantages are also discussed. Results: Machine learning algorithms have accelerated studies in precision medicine targeting epigenetic diseases. Conclusion: To make full use of machine learning algorithms, one should become familiar with their pros and cons, so that the most suitable method(s) can be chosen for a given big-data problem.


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Alan Brnabic ◽  
Lisa M. Hess

Abstract Background Machine learning is a broad term encompassing a number of methods that allow the investigator to learn from the data. These methods may permit large real-world databases to be more rapidly translated into applications that inform patient-provider decision making. Methods This systematic literature review was conducted to identify published observational research that employed machine learning to inform decision making at the patient-provider level. The search strategy was implemented, and studies meeting the eligibility criteria were evaluated by two independent reviewers. Relevant data related to study design, statistical methods, strengths, and limitations were extracted; study quality was assessed using a modified version of the Luo checklist. Results A total of 34 publications from January 2014 to September 2020 were identified and evaluated for this review. Diverse methods, statistical packages, and approaches were used across the identified studies. The most common methods were decision tree and random forest approaches. Most studies applied internal validation, but only two conducted external validation. Most studies used a single algorithm; only eight applied multiple machine learning algorithms to the data. Seven items on the Luo checklist were not met by more than 50% of the published studies. Conclusions A wide variety of approaches, algorithms, statistical software, and validation strategies were employed in the application of machine learning methods to inform patient-provider decision making. To ensure that decisions for patient care are made with the highest-quality evidence, multiple machine learning approaches should be used, the model selection strategy should be clearly defined, and both internal and external validation should be performed. Future work should routinely employ ensemble methods incorporating multiple machine learning algorithms.
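As a hedged illustration of the review's recommendations (not code from any reviewed study), the sketch below compares several scikit-learn algorithms with internal cross-validation plus a held-out split that stands in for external validation, which would ideally use an entirely separate cohort; the synthetic dataset is an assumption.

```python
# Minimal sketch (not from the review): comparing several algorithms with
# internal cross-validation, as the review recommends, using scikit-learn.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a real-world clinical dataset.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

models = {
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "random_forest": RandomForestClassifier(random_state=0),
    "logistic_regression": LogisticRegression(max_iter=1000),
}

for name, model in models.items():
    # Internal validation: 5-fold cross-validation on the training data.
    cv_auc = cross_val_score(model, X_train, y_train, cv=5, scoring="roc_auc")
    # Held-out test set stands in for external validation, which ideally
    # uses a completely separate cohort.
    test_acc = model.fit(X_train, y_train).score(X_test, y_test)
    print(f"{name}: CV AUC {cv_auc.mean():.3f}, held-out accuracy {test_acc:.3f}")
```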


Risks ◽  
2020 ◽  
Vol 9 (1) ◽  
pp. 4 ◽  
Author(s):  
Christopher Blier-Wong ◽  
Hélène Cossette ◽  
Luc Lamontagne ◽  
Etienne Marceau

In the past 25 years, computer scientists and statisticians have developed machine learning algorithms capable of modeling highly nonlinear transformations and interactions of input features. While actuaries use generalized linear models (GLMs) frequently in practice, only in the past few years have they begun studying these newer algorithms for insurance-related tasks. In this work, we review applications of machine learning in actuarial science and present the current state of the art in ratemaking and reserving. We first give an overview of neural networks, then briefly outline applications of machine learning algorithms to actuarial tasks. Finally, we summarize future trends of machine learning in the insurance industry.
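As context for the GLM baseline the abstract contrasts with newer algorithms, here is a minimal ratemaking-style sketch: a Poisson GLM for claim frequency fitted with scikit-learn. The rating factors and data are invented for illustration.

```python
# Minimal ratemaking sketch (illustrative only): a Poisson GLM for claim
# frequency, the classic actuarial baseline that newer ML methods compete with.
import numpy as np
from sklearn.linear_model import PoissonRegressor

rng = np.random.default_rng(0)
n = 5000
# Hypothetical rating factors: driver age and vehicle power.
X = np.column_stack([rng.uniform(18, 80, n), rng.uniform(1, 10, n)])
X_scaled = (X - X.mean(axis=0)) / X.std(axis=0)
# Simulated claim counts with a known log-linear structure.
true_rate = np.exp(-2.0 - 0.3 * X_scaled[:, 0] + 0.5 * X_scaled[:, 1])
y = rng.poisson(true_rate)

glm = PoissonRegressor(alpha=1e-4).fit(X_scaled, y)
print("coefficients:", glm.coef_, "intercept:", glm.intercept_)
# Predicted expected claim frequency feeds directly into a pure premium.
print("expected frequency, first 3 policies:", glm.predict(X_scaled[:3]))
```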


Author(s):  
Qifang Bi ◽  
Katherine E Goodman ◽  
Joshua Kaminsky ◽  
Justin Lessler

Abstract Machine learning is a branch of computer science that has the potential to transform epidemiologic sciences. Amid a growing focus on “Big Data,” it offers epidemiologists new tools to tackle problems for which classical methods are not well-suited. In order to critically evaluate the value of integrating machine learning algorithms with existing methods, however, it is essential to address language and technical barriers between the two fields that can make it difficult for epidemiologists to read and assess machine learning studies. Here, we provide an overview of the concepts and terminology used in the machine learning literature, which encompasses a diverse set of tools with goals ranging from prediction to classification to clustering. We provide a brief introduction to 5 common machine learning algorithms and 4 ensemble-based approaches. We then summarize epidemiologic applications of machine learning techniques in the published literature. We recommend approaches to incorporate machine learning in epidemiologic research and discuss opportunities and challenges for integrating machine learning and existing epidemiologic research methods.
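As a hedged illustration of one ensemble-based approach of the kind such overviews describe (the abstract does not name the specific algorithms), the sketch below combines three common classifiers by soft voting in scikit-learn; the dataset and model choices are examples.

```python
# Illustrative sketch of one ensemble-based approach (soft voting) of the
# kind epidemiologic ML overviews describe; dataset and models are examples.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
ensemble = VotingClassifier(
    estimators=[
        ("logit", make_pipeline(StandardScaler(),
                                LogisticRegression(max_iter=1000))),
        ("forest", RandomForestClassifier(random_state=0)),
        ("nb", GaussianNB()),
    ],
    voting="soft",  # average predicted probabilities across base learners
)
print("5-fold CV accuracy:", cross_val_score(ensemble, X, y, cv=5).mean())
```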


2021 ◽  
Author(s):  
Dhairya Vyas

In machine learning, the majority of data can be grouped into four categories: numerical data, categorical data, time-series data, and text. Different learning paradigms suit different data properties and tasks: supervised, unsupervised, and reinforcement learning, each with its own family of classifiers. We tested almost all common machine learning methods across these categories and present a comparative analysis of them.
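As a concrete illustration of the paradigm distinction (not drawn from the paper itself), the sketch below applies a supervised classifier and an unsupervised clusterer to the same numerical dataset.

```python
# Illustration (not from the paper): the same numerical dataset handled by a
# supervised classifier (labels available) and an unsupervised clusterer.
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Supervised: learn a mapping from features to the known labels y.
clf = KNeighborsClassifier(n_neighbors=5)
print("supervised CV accuracy:", cross_val_score(clf, X, y, cv=5).mean())

# Unsupervised: ignore y and discover structure from the features alone.
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
print("cluster sizes:", [int((labels == k).sum()) for k in range(3)])
```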


2019 ◽  
Author(s):  
Levi John Wolf ◽  
Elijah Knaap

Dimension reduction is one of the oldest concerns in geographical analysis. Despite significant, longstanding attention to geographical problems, recent advances in nonlinear techniques for dimension reduction, called manifold learning, have not been adopted in classic data-intensive geographical problems. More generally, machine learning methods for geographical problems often focus on applying standard machine learning algorithms to geographic data rather than on true "spatially-correlated learning," in the words of Kohonen. As such, we suggest a general way to incentivize geographical learning in machine learning algorithms and link it to many past methods that introduced geography into statistical techniques. We develop a specific instance of this by specifying two geographical variants of Isomap, a nonlinear dimension reduction, or "manifold learning," technique. We also provide a method for assessing what incorporating geography adds and for estimating the manifold's intrinsic geographic scale. To illustrate the concepts and provide interpretable results, we conduct a dimension reduction on the geographical and high-dimensional structure of social and economic data on Brooklyn, New York. Overall, this paper's main endeavor, defining and explaining a way to "geographize" many machine learning methods, yields interesting and novel results for manifold learning and for the estimation of intrinsic geographical scale in unsupervised learning.
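The abstract does not specify the paper's two geographical Isomap variants. As a hypothetical stand-in, the sketch below runs scikit-learn's standard Isomap on attribute data augmented with scaled spatial coordinates, one simple way to inject geography into the manifold; the weighting knob and all data are assumptions.

```python
# Hypothetical stand-in (the paper's actual variants are not given in the
# abstract): standard Isomap run on attribute data augmented with scaled
# spatial coordinates, one simple way to let geography shape the manifold.
import numpy as np
from sklearn.manifold import Isomap
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 500
coords = rng.uniform(0, 10, size=(n, 2))  # synthetic x, y locations
attrs = rng.normal(size=(n, 8))           # synthetic social/economic data

weight = 0.5  # assumed knob: how strongly geography shapes the embedding
features = np.hstack([
    StandardScaler().fit_transform(attrs),
    weight * StandardScaler().fit_transform(coords),
])

embedding = Isomap(n_neighbors=10, n_components=2).fit_transform(features)
print("embedded shape:", embedding.shape)
```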


Sensors ◽  
2019 ◽  
Vol 19 (7) ◽  
pp. 1521 ◽  
Author(s):  
Tomasz Rymarczyk ◽  
Grzegorz Kłosowski ◽  
Edward Kozłowski ◽  
Paweł Tchórzewski

The main goal of this work was to compare selected machine learning methods with a classic deterministic method in the industrial application of electrical impedance tomography. The research focused on developing and comparing algorithms and models for the analysis and reconstruction of data from electrical tomography. The novelty was the use of original machine learning algorithms whose characteristic feature is the use of many separately trained subsystems, each of which generates a single pixel of the output image. Artificial Neural Network (ANN), LARS, and Elastic Net methods were used to solve the inverse problem. These algorithms were adapted to electrical impedance tomography by multiplying the number of equations accordingly to match the finite element method grid. The Gauss-Newton method was used as a reference for the machine learning methods. The algorithms were trained on learning data obtained through computer simulation based on real models. The experiments showed that, in the considered cases, the best reconstruction quality was achieved by the ANN. At the same time, the ANN was the slowest in terms of both the training process and the speed of image generation. The other machine learning methods were comparable with the deterministic Gauss-Newton method and with each other.
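The "one separately trained subsystem per pixel" design can be sketched schematically. The sketch below is not the authors' implementation: it trains an independent Elastic Net per output pixel, mapping boundary measurements to that pixel's conductivity value, and all array sizes and the synthetic data are assumptions.

```python
# Schematic sketch (not the authors' implementation) of the "one separately
# trained subsystem per pixel" idea: an independent Elastic Net maps boundary
# measurements to each pixel's conductivity value.
import numpy as np
from sklearn.linear_model import ElasticNet

rng = np.random.default_rng(0)
n_samples, n_measurements, n_pixels = 800, 96, 256  # assumed sizes

# Synthetic stand-ins for simulated training data: boundary voltage vectors
# and the corresponding flattened conductivity images.
measurements = rng.normal(size=(n_samples, n_measurements))
images = measurements @ rng.normal(size=(n_measurements, n_pixels))
images += 0.1 * rng.normal(size=images.shape)

# Train one model per pixel; each learns only its own pixel's intensity.
pixel_models = [
    ElasticNet(alpha=0.01).fit(measurements, images[:, p])
    for p in range(n_pixels)
]

# Reconstruction: query every per-pixel model for a new measurement vector.
new_measurement = measurements[:1]
reconstruction = np.array([m.predict(new_measurement)[0] for m in pixel_models])
print("reconstructed image shape:", reconstruction.reshape(16, 16).shape)
```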


Hypertension ◽  
2020 ◽  
Vol 76 (2) ◽  
pp. 569-576 ◽  
Author(s):  
Kelvin K.F. Tsoi ◽  
Nicholas B. Chan ◽  
Karen K.L. Yiu ◽  
Simon K.S. Poon ◽  
Bryant Lin ◽  
...  

Visit-to-visit blood pressure variability (BPV) has been shown to be a predictor of cardiovascular disease. We aimed to classify BPV levels using different machine learning algorithms. Visit-to-visit blood pressure readings were extracted from the SPRINT study in the United States and an eHealth cohort in Hong Kong (HK cohort). Patients were clustered into low, medium, and high BPV levels using traditional quantile clustering and 5 machine learning algorithms, including K-means. Clustering methods were assessed by a stability index; clustering similarity was assessed by the Davies-Bouldin index and the Silhouette index. Cox proportional hazards regression models were fitted to compare the risks of myocardial infarction, stroke, and heart failure. A total of 8133 participants had blood pressure measured an average of 14.7 times over 3.28 years in SPRINT, and 1094 participants had blood pressure measured an average of 165.4 times over 1.37 years in the HK cohort. Quantile clustering assigned one-third of participants to the high BPV level, whereas the machine learning methods assigned only 10% to 27%. Quantile clustering was the most stable method (stability index: 0.982 in SPRINT and 0.948 in the HK cohort) with some degree of clustering similarity (Davies-Bouldin index: 0.752 and 0.764, respectively). K-means clustering was the most stable of the machine learning algorithms (stability index: 0.975 and 0.911, respectively) with the lowest clustering similarity (Davies-Bouldin index: 0.653 and 0.680, respectively). About one in 7 of the population was classified as having a high BPV level, and these participants showed higher risks of stroke and heart failure. Machine learning methods can improve BPV classification for better prediction of cardiovascular diseases.
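As a hedged illustration of the clustering-and-assessment workflow (not the study's actual pipeline), the sketch below derives a simple BPV feature from synthetic visit-to-visit readings, clusters it with K-means, and scores the result with scikit-learn's Davies-Bouldin index; all data and sizes are invented.

```python
# Illustrative sketch (not the study's pipeline): cluster patients into BPV
# levels with K-means and score the result with the Davies-Bouldin index.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import davies_bouldin_score

rng = np.random.default_rng(0)
# Synthetic visit-to-visit systolic readings: 500 patients x 15 visits,
# each patient drawn with a different within-patient variability.
readings = rng.normal(loc=130, scale=rng.uniform(2, 18, size=(500, 1)),
                      size=(500, 15))

# A simple BPV feature per patient: the standard deviation across visits.
bpv = readings.std(axis=1, keepdims=True)

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(bpv)
print("cluster sizes:", np.bincount(kmeans.labels_))
print("Davies-Bouldin index:", davies_bouldin_score(bpv, kmeans.labels_))
```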

