Learning analytics made in France: the METAL project

2019 ◽  
Vol 36 (4) ◽  
pp. 299-313 ◽  
Author(s):  
Armelle Brun ◽  
Geoffray Bonnin ◽  
Sylvain Castagnos ◽  
Azim Roussanaly ◽  
Anne Boyer

Purpose: This paper presents the METAL project, a French open learning analytics (LA) project for secondary schools that aims to improve the quality of teaching. The originality of METAL is that it relies on research through exploratory activities and addresses all aspects of a learning analytics environment. Design/methodology/approach: This work introduces the different concerns of the project: collection and storage of multi-source data owned by a variety of stakeholders, selection and promotion of standards, design of an open-source learning record store (LRS), design of dashboards with their final users, trust, usability, and design of explainable multi-source data mining algorithms. Findings: All the dimensions of METAL are presented, as well as the way they are approached: data sources; data storage, through the implementation of an LRS; design of dashboards for secondary school, based on co-design sessions; and data mining algorithms and experiments, in line with privacy and ethics concerns. Originality/value: The global dissemination of LA at the level of an institution, or at a broader level such as a territory or a level of study, is still a hot topic in the literature. It is one of the focuses and original contributions of this paper, together with the wide spectrum of concerns it covers.

2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Thiago Cesar de Oliveira ◽  
Lúcio de Medeiros ◽  
Daniel Henrique Marco Detzel

Purpose: Real estate appraisals are becoming an increasingly important means of backing up financial operations based on the values of these kinds of assets. However, in very large databases, predictive capacity is reduced when traditional methods such as multiple linear regression (MLR) are used. This paper aims to determine whether, in these cases, the application of data mining algorithms can achieve superior statistical results. Design/methodology/approach: First, real estate appraisal databases from five towns and cities in the State of Paraná, Brazil, were obtained from the Caixa Econômica Federal bank. After initial validations, additional databases were generated with real, transformed and nominal values, in both clean and raw form. A wide range of data mining algorithms (multilayer perceptron, support vector regression, K-star, M5Rules and random forest) was applied to each database, either in isolation or in combination (regression by discretization – logistic, bagging and stacking), using 10-fold cross-validation in the Weka software. Findings: The algorithms yielded varied incremental statistical improvements over the results obtained by MLR, especially when combined algorithms were used. The largest increments were obtained in databases with a large amount of data and in those where only minor initial data cleaning was carried out. The paper also conducts a further analysis, including a ranking of the algorithms based on the number of significant results obtained. Originality/value: The authors did not find similar studies conducted in Brazil.
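The 10-fold cross-validation mentioned in this abstract can be sketched in plain Python. This is only an illustration of the evaluation procedure, not the Weka workflow the authors used: the function names, the single-predictor least-squares model and the synthetic data are all assumptions introduced here.

```python
import random
from statistics import mean

def k_fold_indices(n, k=10, seed=0):
    """Shuffle indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def fit_simple_ols(xs, ys):
    """Least-squares fit of y = a + b*x for a single predictor."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def cross_validated_mae(xs, ys, k=10, seed=0):
    """Mean absolute error averaged over k held-out folds."""
    fold_errors = []
    for held_out in k_fold_indices(len(xs), k, seed):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        # Fit on the other k-1 folds, score on the held-out fold only.
        a, b = fit_simple_ols([xs[i] for i in train], [ys[i] for i in train])
        fold_errors.append(mean(abs(ys[i] - (a + b * xs[i])) for i in held_out))
    return mean(fold_errors)

if __name__ == "__main__":
    # Synthetic toy data (not from the paper): price roughly linear in area.
    rng = random.Random(42)
    areas = [rng.uniform(40, 200) for _ in range(100)]
    prices = [1000 * a + rng.gauss(0, 5000) for a in areas]
    print(cross_validated_mae(areas, prices))
```

Swapping `fit_simple_ols` for another learner while keeping the same folds is what makes a comparison between models fair, since every model is scored on the same held-out records.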


Author(s):  
Qin Ding

With the growing use of XML for data storage and exchange, there is a pressing need for efficient algorithms that perform data mining on semistructured XML data. Mining XML data is much more difficult than mining relational data because of the structural complexity of XML. A naïve approach is to first convert the XML data into a relational format. However, structural information may be lost during the conversion. It is therefore desirable to develop efficient and effective data mining algorithms that can be applied directly to XML data.
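The loss of structure during conversion can be illustrated with a small sketch using Python's standard `xml.etree.ElementTree` module. The sample document and both helper functions are hypothetical and only contrast a naive flattening with one that keeps each element's path.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document: "name" appears under two different parents.
DOC = """
<library>
  <book><title>A</title><author><name>X</name></author></book>
  <book><title>B</title><editor><name>Y</name></editor></book>
</library>
"""

def flatten(root):
    """Naive relational view: one (tag, text) row per leaf element."""
    return [(el.tag, el.text.strip()) for el in root.iter()
            if len(el) == 0 and el.text and el.text.strip()]

def flatten_with_paths(root, prefix=""):
    """Same leaves, but each row keeps the path from the document root."""
    path = f"{prefix}/{root.tag}"
    if len(root) == 0:
        return [(path, (root.text or "").strip())]
    rows = []
    for child in root:
        rows.extend(flatten_with_paths(child, path))
    return rows

if __name__ == "__main__":
    root = ET.fromstring(DOC)
    print(flatten(root))             # rows no longer say who is author or editor
    print(flatten_with_paths(root))  # the path preserves that distinction
```

In the naive rows, `X` and `Y` are both just a `name`; only the path-preserving version still records that one is an author and the other an editor.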


2014 ◽  
Vol 644-650 ◽  
pp. 1702-1705 ◽  
Author(s):  
Jin Hai Zhang

Because internet data are massive, diverse, heterogeneous and dynamic, the storage and processing efficiency of traditional databases can no longer meet the requirements of analyzing them. This paper uses leading-edge distributed computing technology to address the inability of traditional data mining to handle massive data, improving a data mining algorithm on the Hadoop distributed computing platform. The result can serve as a reference for later implementations of other data mining algorithms on Hadoop, and applying a rich set of data mining algorithms can uncover more value in the data.


2020 ◽  
Vol 166 ◽  
pp. 05007
Author(s):  
Vitalii Levkivskyi ◽  
Nadiia Lobanchykova ◽  
Dmytro Marchuk

The article explores data mining algorithms, which are based on rules and calculations and allow us to create a model that analyzes the provided data by searching for specific patterns and trends. The purpose of this work is to analyze correlation-regression algorithms on a statistical dataset of chronic diseases. Data mining allows many models to be built, and multiple algorithms can be used within a single solution. The article explores clustering, correlation analysis and the Naive Bayes algorithm for obtaining different views of the data. Diabetes is one of the most dangerous chronic diseases; its pathogenesis is a lack of insulin in the human body, which causes metabolic disorders and pathological changes in various organs and tissues and, as a result, leads to impairment of all functional systems of the body. It was therefore decided to investigate the data related to this disease. The quality of the developed methods of information retrieval from the dataset was also evaluated, and the most informative features were identified. The developed methods were implemented in an intelligent data processing system. Past studies show the promise of using data mining methods to improve the quality of patient care.
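The abstract does not say how the most informative features were identified. One simple possibility, sketched here purely as an assumption, is to rank features by the absolute value of their Pearson correlation with the target. The function names and any column names passed in are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def rank_features(columns, target):
    """Sort (feature name, r) pairs by |r| with the target, strongest first."""
    scores = {name: pearson_r(values, target) for name, values in columns.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)
```

A ranking of this kind captures only linear association, so it complements the clustering and Naive Bayes views mentioned in the abstract and does not replace them.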


2017 ◽  
Vol 1 (3) ◽  
pp. 153
Author(s):  
Gathut Cakra Sutradana ◽  
M Didik Rohmad Wahyudi

Abstract: The timeliness of students' study duration at a university is very important in demonstrating the quality of the learning process in higher education. Many factors affect a student's study duration. Data mining offers a way to identify the various aspects that may affect it. To identify the aspects that influence study duration from the available graduation data, a data mining algorithm can be applied. In this study, the data mining algorithm used to find the aspects that affect student study duration is the Apriori algorithm. Keywords: graduation analysis, study duration, data mining, Apriori algorithm
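For readers unfamiliar with Apriori, the following is a minimal sketch of its frequent-itemset stage, assuming each graduate record is represented as a set of attribute labels. It is a textbook-style illustration, not the authors' implementation, and the labels in the usage example are invented.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {frozenset(itemset): support} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item of the itemset.
        return sum(itemset <= t for t in transactions) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in items}
    current = {s for s in current if support(s) >= min_support}

    frequent = {}
    k = 1
    while current:
        frequent.update({s: support(s) for s in current})
        k += 1
        # Join step: unions of frequent (k-1)-itemsets that have exactly k items.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(sub) in current
                             for sub in combinations(c, k - 1))}
        current = {c for c in candidates if support(c) >= min_support}
    return frequent

if __name__ == "__main__":
    # Invented example records, not data from the study.
    records = [
        {"gpa_high", "on_time"},
        {"gpa_high", "on_time", "scholarship"},
        {"gpa_low", "late"},
        {"gpa_high", "on_time"},
        {"gpa_low", "late", "scholarship"},
    ]
    for itemset, sup in sorted(apriori(records, 0.4).items(), key=lambda kv: -kv[1]):
        print(sorted(itemset), sup)
```

The prune step is what distinguishes Apriori from brute-force enumeration: a candidate is counted against the data only if all of its smaller subsets were already found frequent.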


2013 ◽  
Vol 475-476 ◽  
pp. 1008-1012
Author(s):  
Li Hua Yang ◽  
Gui Lin Li ◽  
Shao Bin Zhou ◽  
Ming Hong Liao

Outlier detection selects uncommon data from a data set, which can significantly improve the quality of the results of data mining algorithms. A typical feature of outliers is that they lie far away from the majority of the data in the data set. In this paper, we present a graph-based outlier detection algorithm named INOD, which makes use of this feature. The DistMean-neighborhood is used to calculate the cumulative in-degree of each data point. A data point whose cumulative in-degree is smaller than a threshold is judged to be an outlier candidate. A KNN-based selection algorithm is then used to determine the final outliers. Experimental results show that, compared with the classical LOF algorithm, the INOD algorithm can improve precision by 80% and reduce the error rate by 75%.
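The in-degree idea can be sketched as follows. The abstract does not define the DistMean-neighborhood, so this sketch substitutes a plain k-nearest-neighbour graph as a stand-in and covers only the candidate stage. It is not the INOD algorithm itself, and the function names, `k` and `threshold` are assumptions. It uses `math.dist`, available from Python 3.8.

```python
import math

def knn_in_degree(points, k):
    """In-degree of each point in the directed k-nearest-neighbour graph."""
    in_degree = [0] * len(points)
    for i, p in enumerate(points):
        # Every other point, ordered by distance from p (ties keep index order).
        others = sorted((j for j in range(len(points)) if j != i),
                        key=lambda j: math.dist(p, points[j]))
        for j in others[:k]:
            in_degree[j] += 1  # edge i -> j: j is one of i's k nearest neighbours
    return in_degree

def outlier_candidates(points, k=3, threshold=1):
    """Indices of points that fewer than `threshold` others count as a neighbour."""
    degrees = knn_in_degree(points, k)
    return [i for i, d in enumerate(degrees) if d < threshold]

if __name__ == "__main__":
    # Toy data: a tight cluster plus one distant point.
    pts = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (10, 10)]
    print(outlier_candidates(pts, k=3, threshold=1))
```

A point far from the bulk of the data is rarely chosen as anyone's neighbour, so its in-degree stays low. With small `k`, ordinary points can also end up with a low in-degree, which is presumably why the paper adds a second, KNN-based selection stage.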


2019 ◽  
Vol 14 (1) ◽  
pp. 21-26 ◽  
Author(s):  
Viswam Subeesh ◽  
Eswaran Maheswari ◽  
Hemendra Singh ◽  
Thomas Elsa Beulah ◽  
Ann Mary Swaroop

Background: A signal is defined as “reported information on a possible causal relationship between an adverse event and a drug, of which the relationship is unknown or incompletely documented previously”. Objective: To detect novel adverse events of iloperidone by disproportionality analysis in the FDA Adverse Event Reporting System (FAERS) database using Data Mining Algorithms (DMAs). Methodology: The US FAERS database contains 1028 iloperidone-associated Drug Event Combinations (DECs) reported from 2010 Q1 to 2016 Q3. DECs were considered for disproportionality analysis only if at least ten reports were present in the database for the given adverse event and the event had not been detected earlier (in clinical trials). Two data mining algorithms, namely Reporting Odds Ratio (ROR) and Information Component (IC), were applied retrospectively over this time period. Values of ROR-1.96SE>1 and IC-2SD>0 were considered the thresholds for a positive signal. Results: The mean age of the patients with iloperidone-associated events was 44 years [95% CI: 36-51], although age was not mentioned in twenty-one reports. The data mining algorithms exhibited a positive signal for akathisia (ROR-1.96SE=43.15, IC-2SD=2.99), dyskinesia (21.24, 3.06), peripheral oedema (6.67, 1.08), priapism (425.7, 9.09) and sexual dysfunction (26.6, 1.5), as these values were well above the pre-set thresholds. Conclusion: Five potential signals associated with iloperidone were generated by data mining in the FDA AERS database. The result requires further clinical surveillance to quantify and validate the possible risks of the adverse events reported for iloperidone.
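The two disproportionality measures can be sketched from a 2x2 contingency table of report counts. This sketch assumes that ROR-1.96SE denotes the lower bound of the 95% confidence interval computed on the log scale, which is the usual convention. For IC it computes only the point estimate and does not reproduce the IC-2SD bound, whose variance formula is not given in the abstract. The function names and the counts in the usage example are hypothetical, not FAERS figures.

```python
import math

# 2x2 table of report counts:
#   a: target drug, target event      b: target drug, other events
#   c: other drugs, target event      d: other drugs, other events

def reporting_odds_ratio(a, b, c, d):
    """Return (ROR, lower bound of its 95% confidence interval)."""
    ror = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of ln(ROR)
    lower = math.exp(math.log(ror) - 1.96 * se_log)
    return ror, lower

def information_component(a, b, c, d):
    """Point estimate of IC: log2 of observed over expected report count."""
    n = a + b + c + d
    expected = (a + b) * (a + c) / n  # count expected if drug and event were independent
    return math.log2(a / expected)

if __name__ == "__main__":
    # Hypothetical counts chosen for illustration only.
    ror, lower = reporting_odds_ratio(20, 80, 100, 9800)
    print(ror, lower, information_component(20, 80, 100, 9800))
```

Under these conventions a drug-event pair is flagged when the lower bound of the ROR exceeds 1 and the lower bound of the IC exceeds 0, that is, when the event is reported with the drug more often than independence would predict even after allowing for sampling uncertainty.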

