Generation of Automatic Data-Driven Feedback to Students Using Explainable Machine Learning

Author(s):  
Muhammad Afzaal ◽  
Jalal Nouri ◽  
Aayesha Zia ◽  
Panagiotis Papapetrou ◽  
Uno Fors ◽  
...  
2021 ◽  
Vol 11 (23) ◽  
pp. 11429
Author(s):  
Jurgen van den Hoogen ◽  
Stefan Bloemheuvel ◽  
Martin Atzmueller

With improved computational power and the vast amount of (automatic) data collection, industry has become more data-driven. These data-driven approaches for monitoring processes and machinery require different modeling methods focusing on automated learning and deployment. In this context, deep learning provides possibilities for industrial diagnostics to achieve improved performance and efficiency. Deep learning models can automatically extract features during training, eliminating time-consuming feature engineering and the need for prior understanding of sophisticated (signal) processing techniques. This paper extends previous work, introducing one-dimensional (1D) CNN architectures that utilize an adaptive wide-kernel layer to improve classification of multivariate signals, e.g., time series classification in a fault detection and condition monitoring context. We used multiple prominent benchmark datasets for rolling bearing fault detection to determine the performance of the proposed wide-kernel CNN architectures in different settings. For example, distinct experimental conditions were tested with varying amounts of training data. We shed light on the performance of these models compared to traditional machine learning approaches and explain different approaches to handling multivariate signals with deep learning. Our proposed models show promising results for classifying different fault conditions of rolling bearing elements and their respective machine condition, while using a fairly straightforward 1D CNN architecture with minimal data preprocessing. Thus, a 1D CNN with an adaptive wide-kernel layer seems well-suited for fault detection and condition monitoring. In addition, this paper indicates the high potential performance of deep learning compared to traditional machine learning, particularly in complex multivariate and multi-class classification tasks.
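As a rough illustration of what a wide-kernel first layer does to a multivariate signal, the following minimal sketch applies a single wide 1D convolution filter across two channels in plain Python. The kernel width (64), stride (8), signal length (256), and the function names are assumptions chosen for the example; the abstract does not state the paper's actual settings, and the "adaptive" part of the layer is not modeled here.

```python
def conv1d_multichannel(signal, kernel, stride=1):
    """Valid (unpadded) 1D convolution of a multichannel signal with one filter.

    signal: list of channels, each a list of floats of equal length
    kernel: list of per-channel weight lists, one per input channel
    Returns a single output feature map as a list of floats.
    """
    if len(signal) != len(kernel):
        raise ValueError("kernel must have one weight list per input channel")
    n = len(signal[0])
    k = len(kernel[0])
    if k > n:
        raise ValueError("kernel is wider than the signal")
    out = []
    # Slide the window over time; each output sums over all channels.
    for start in range(0, n - k + 1, stride):
        total = 0.0
        for channel, weights in zip(signal, kernel):
            for j in range(k):
                total += channel[start + j] * weights[j]
        out.append(total)
    return out


def output_length(n, k, stride):
    """Number of output steps for a valid convolution."""
    return (n - k) // stride + 1


# Hypothetical example: two sensor channels, 256 samples each.
signal = [[1.0] * 256, [2.0] * 256]
# One wide averaging filter (width 64) per channel; weights are illustrative.
kernel = [[1.0 / 64] * 64, [1.0 / 64] * 64]

feature_map = conv1d_multichannel(signal, kernel, stride=8)
print(len(feature_map), output_length(256, 64, 8))
```

A wide kernel with a large stride covers a long stretch of the raw signal per output step and shortens the sequence quickly, which is one plausible reason such a layer suits vibration-style time series; in a real network the weights would be learned and many filters would run in parallel.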


Animals ◽  
2020 ◽  
Vol 10 (4) ◽  
pp. 747
Author(s):  
Federica Borgonovo ◽  
Valentina Ferrante ◽  
Guido Grilli ◽  
Riccardo Pascuzzo ◽  
Simone Vantini ◽  
...  

Coccidiosis is still one of the major parasitic infections in poultry. It is caused by protozoa of the genus Eimeria, which cause considerable economic losses due to malabsorption, poor feed conversion rate, reduced weight gain, and increased mortality. The greatest damage is registered in commercial poultry farms because birds are reared together in large numbers and at high densities. Unfortunately, these enteric pathologies are not preventable, and their diagnosis is only available when the disease is full-blown. For these reasons, the preventive use of anticoccidials (some of which have antimicrobial action) is a common practice in intensive farming, and this type of management leads to the release of drugs into the environment, which contributes to the phenomenon of antibiotic resistance. Due to the high relevance of this issue, the early detection of any health problem is of great importance to improve animal welfare in intensive farming. Three prototypes, previously calibrated and adjusted, were developed and tested in three different experimental poultry farms in order to evaluate whether the system was able to identify coccidia infection early in intensive poultry farms. For this purpose, a data-driven machine learning algorithm was built, and specific critical values of volatile organic compounds (VOCs) were found to be associated with abnormal oocyst counts at an early stage of the disease. This result supports the feasibility of building an automatic data-driven machine learning algorithm for an early warning of coccidiosis.


Author(s):  
Ekaterina Kochmar ◽  
Dung Do Vu ◽  
Robert Belfer ◽  
Varun Gupta ◽  
Iulian Vlad Serban ◽  
...  

Intelligent tutoring systems (ITS) have been shown to be highly effective at promoting learning as compared to other computer-based instructional approaches. However, many ITS rely heavily on expert design and hand-crafted rules. This makes them difficult to build and transfer across domains and limits their potential efficacy. In this paper, we investigate how feedback in a large-scale ITS can be automatically generated in a data-driven way, and more specifically how personalization of feedback can lead to improvements in student performance outcomes. First, we propose a machine learning approach to generate personalized feedback in an automated way, which takes the individual needs of students into account while alleviating the need for expert intervention and the design of hand-crafted rules. We leverage state-of-the-art machine learning and natural language processing techniques to provide students with personalized feedback using hints and Wikipedia-based explanations. Second, we demonstrate that personalized feedback leads to improved success rates at solving exercises in practice: our personalized feedback model is used in a large-scale dialogue-based ITS with around 20,000 students, launched in 2019. We present the results of experiments with students and show that the automated, data-driven, personalized feedback leads to a significant overall improvement of 22.95% in student performance outcomes and substantial improvements in the subjective evaluation of the feedback.


Water ◽  
2021 ◽  
Vol 13 (9) ◽  
pp. 1208
Author(s):  
Massimiliano Bordoni ◽  
Fabrizio Inzaghi ◽  
Valerio Vivaldi ◽  
Roberto Valentino ◽  
Marco Bittelli ◽  
...  

Soil water potential is a key factor for studying water dynamics in soil and for estimating the occurrence of natural hazards such as landslides. This parameter can be measured in the field or estimated through physically-based models, which are limited by the availability of effective input soil properties and preliminary calibrations. Data-driven models, based on machine learning techniques, could overcome these gaps. The aim of this paper is to develop an innovative machine learning methodology to assess soil water potential trends and to implement them in models to predict shallow landslides. Monitoring data since 2012 from test-site slopes in Oltrepò Pavese (northern Italy) were used to build the models. Among the tested techniques, Random Forest models allowed an outstanding reconstruction of measured soil water potential temporal trends. Each model is sensitive to meteorological and hydrological characteristics according to soil depths and features. Reliability of the proposed models was confirmed by the correct estimation of days when shallow landslides were triggered in the study areas in December 2020, after implementing the modeled trends in a slope stability model, and by the correct choice of physically-based rainfall thresholds. These results confirm the potential application of the developed methodology to estimate hydrological scenarios that could be used for decision-making purposes.


Atmosphere ◽  
2021 ◽  
Vol 12 (1) ◽  
pp. 109
Author(s):  
Ashima Malik ◽  
Megha Rajam Rao ◽  
Nandini Puppala ◽  
Prathusha Koouri ◽  
Venkata Anil Kumar Thota ◽  
...  

Over the years, rampant wildfires have plagued the state of California, causing economic and environmental losses. In 2018, wildfires cost nearly 800 million dollars in economic loss and claimed more than 100 lives in California. Over 1.6 million acres of land burned, causing extensive environmental damage. Although researchers have recently introduced machine learning models and algorithms for predicting wildfire risk, these studies focused on specific perspectives and were restricted to a limited number of data parameters. In this paper, we propose two data-driven machine learning approaches based on random forest models to predict the wildfire risk in areas near Monticello and Winters, California. This study demonstrates how the models were developed and applied with comprehensive data parameters such as powerlines, terrain, and vegetation from different perspectives, which improved the spatial and temporal accuracy in predicting the risk of wildfire, including fire ignition. The combined model uses the spatial and the temporal parameters as a single combined dataset to train and predict the fire risk, whereas the ensemble model was fed separate parameters that were later stacked to work as a single model. Our experiment shows that the combined model produced better accuracy than the ensemble of random forest models trained on separate spatial data. The models were validated with Receiver Operating Characteristic (ROC) curves, learning curves, and evaluation metrics such as accuracy, confusion matrices, and classification reports. The study achieved an accuracy of 92% in predicting wildfire risk, including ignition, by utilizing regional spatial and temporal data along with standard data parameters in Northern California.
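The validation metrics named in this abstract (confusion matrix, accuracy, and the area under the ROC curve) can be sketched in a few lines of plain Python. The labels, scores, and the 0.5 decision threshold below are invented toy values for illustration only; they are not data or results from the study, and the function names are hypothetical.

```python
def confusion_matrix(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (0 or 1)."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            counts["tp"] += 1
        elif truth == 0 and pred == 1:
            counts["fp"] += 1
        elif truth == 0 and pred == 0:
            counts["tn"] += 1
        else:
            counts["fn"] += 1
    return counts


def accuracy(cm):
    """Fraction of all predictions that were correct."""
    return (cm["tp"] + cm["tn"]) / sum(cm.values())


def roc_auc(y_true, scores):
    """Area under the ROC curve via pairwise ranking.

    Equals the probability that a randomly chosen positive example
    receives a higher score than a randomly chosen negative one
    (ties count as half).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: 1 = fire risk, 0 = no fire risk (made-up values).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

cm = confusion_matrix(y_true, y_pred)
print(cm, accuracy(cm), roc_auc(y_true, scores))
```

Accuracy depends on the chosen threshold, while the ROC area summarizes ranking quality across all thresholds, which is why the two are usually reported together.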

