Workshop Preview: Data Analytics and Machine Learning Hackathon 2021: A deep dive into the open-source data challenge for E&P

2021 ◽  
Vol 40 (1) ◽  
pp. 68-71
Author(s):  
Haibin Di ◽  
Anisha Kaul ◽  
Leigh Truelove ◽  
Weichang Li ◽  
Wenyi Hu ◽  
...  

We present a data challenge as part of the hackathon planned for the August 2021 SEG Research Workshop on Data Analytics and Machine Learning for Exploration and Production. The hackathon aims to provide hands-on machine learning experience for beginners and advanced practitioners, using a relatively well-defined problem and a carefully curated data set. The seismic data are from New Zealand's Taranaki Basin. The labels for a subset of the data have been generated by an experienced geologist. The objective of the challenge is to develop innovative machine learning solutions to identify key horizons.

2021 ◽  
Vol 17 (3) ◽  
pp. e1008671
Author(s):  
Janez Demšar ◽  
Blaž Zupan

Overfitting is one of the critical problems in developing models by machine learning. With machine learning becoming an essential technology in computational biology, we must include training about overfitting in all courses that introduce this technology to students and practitioners. We here propose hands-on training for overfitting that is suitable for introductory-level courses and can be carried out on its own or embedded within any data science course. We use workflow-based design of machine learning pipelines, experimentation-based teaching, and a hands-on approach that focuses on concepts rather than underlying mathematics. We here detail the data analysis workflows we use in training and motivate them from the viewpoint of teaching goals. Our proposed approach relies on Orange, an open-source data science toolbox that combines data visualization and machine learning, and that is tailored for education in machine learning and explorative data analysis.
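As a rough illustration of the concept this abstract is about (not the Orange workflows the authors describe), the following Python sketch shows a model that memorizes noisy training labels and therefore scores perfectly on the training data while doing much worse on fresh data. The synthetic data, the 1-nearest-neighbour model, the noise level, and all function names are assumptions made for this example.

```python
import random

def make_data(n, rng, noise=0.3):
    """Synthetic 1-D data: label is 1 when x > 0.5, flipped with probability `noise`."""
    data = []
    for _ in range(n):
        x = rng.random()
        y = int(x > 0.5)
        if rng.random() < noise:
            y = 1 - y  # label noise the model should NOT learn
        data.append((x, y))
    return data

def one_nn_predict(train, x):
    """Predict the label of the single nearest training point (pure memorization)."""
    return min(train, key=lambda p: abs(p[0] - x))[1]

def accuracy(train, data):
    """Fraction of points in `data` whose label the 1-NN model reproduces."""
    return sum(one_nn_predict(train, x) == y for x, y in data) / len(data)

rng = random.Random(0)       # fixed seed so the run is repeatable
train = make_data(200, rng)
test = make_data(200, rng)

train_acc = accuracy(train, train)  # each point is its own nearest neighbour
test_acc = accuracy(train, test)    # memorized noise does not generalize
print(f"train accuracy: {train_acc:.2f}, test accuracy: {test_acc:.2f}")
```

The gap between the two printed numbers is the symptom of overfitting: evaluating on the training data alone would suggest a perfect model.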


2020 ◽  
Vol 2 (1) ◽  
pp. 42
Author(s):  
Steve Leichtweis

Universities are increasingly being expected to ensure student success while at the same time delivering larger courses. Within this environment, the provision of effective and timely feedback to students and creating opportunities for genuine engagement between teachers and students is increasingly difficult if not impossible for many instructors, despite the known value and importance of feedback (Hattie & Timperley, 2007) and instructor presence (Garrison, Anderson & Archer, 2010). Similar to other tertiary institutions, the University of Auckland has adopted various technology-enhanced learning approaches and technologies, including learning analytics, in an attempt to support teaching and learning at scale. The increased use of educational technology to support learning provides a variety of data sources for teachers to provide personalised feedback and improve the overall learning experience for students. This workshop is targeted to teachers interested in the use of learning data to provide personalised support to learners. Participants will have a hands-on opportunity to use the open-source tool OnTask (Pardo et al., 2018) within some common teaching scenarios with a synthetically generated data set. The facilitators will also share and discuss how OnTask is currently being used in universities to support student experience, teaching practice and course design. As this is a hands-on workshop, participants must bring a laptop computer to work with the online tool and the prepared scenarios.

References

Garrison, D. R., Anderson, T., & Archer, W. (2010). The first decade of the community of inquiry framework: A retrospective. The Internet and Higher Education, 13(1-2), 5-9.

Hattie, J., & Timperley, H. (2007). The power of feedback. Review of Educational Research, 77(1), 81-112.

Pardo, A., Bartimote-Aufflick, K., Shum, S. B., Dawson, S., Gao, J., Gaševic, D., Leichtweis, S., Liu, D., Martínez-Maldonado, R., Mirriahi, N., & Moskal, A. C. M. (2018). OnTask: Delivering Data-Informed, Personalized Learning Support Actions. Journal of Learning Analytics, 5(3), 235-249.


2019 ◽  
Vol 9 (17) ◽  
pp. 3558 ◽  
Author(s):  
Jinying Yu ◽  
Yuchen Gao ◽  
Yuxin Wu ◽  
Dian Jiao ◽  
Chang Su ◽  
...  

Non-intrusive load monitoring (NILM) is a core technology for demand response (DR) and energy conservation services. Traditional NILM methods are rarely combined with practical applications, and most studies aim to disaggregate the entire household load, which leads to low identification accuracy. In the proposed method, event detection is used to obtain the switching event sets of all loads, and the power consumption curves of independent unknown electrical appliances over a period are disaggregated using comprehensive features. A linear discriminant classifier group based on multi-feature global similarity is used for load identification. The novelty of our algorithm lies in its event detector based on steady-state segmentation and its linear discriminant classifier group based on multi-feature global similarity. The simulation is carried out on an open-source data set. The results demonstrate the effectiveness and high accuracy of the multi-feature integrated classification (MFIC) algorithm when benchmarked against state-of-the-art NILM methods.
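To make the idea of event detection via steady-state segmentation more concrete, here is a minimal Python sketch. It is not the detector from the paper: the segmentation rule (a sample joins the current segment while it stays within a fixed tolerance of the running segment mean), the `tolerance` value, the function names, and the sample readings are all assumptions chosen for illustration. The classifier stage is not sketched.

```python
def segment_steady_states(power, tolerance=30.0):
    """Split an aggregate power series (watts) into steady-state segments.

    Returns a list of (start, end_exclusive, mean_power) tuples.
    """
    if not power:
        return []
    segments = []
    start, total = 0, power[0]
    for i in range(1, len(power)):
        seg_mean = total / (i - start)
        if abs(power[i] - seg_mean) > tolerance:
            # The sample leaves the current steady state: close the segment.
            segments.append((start, i, seg_mean))
            start, total = i, power[i]
        else:
            total += power[i]
    segments.append((start, len(power), total / (len(power) - start)))
    return segments

def detect_events(power, tolerance=30.0):
    """Return (index, delta_watts) for every switch between steady states."""
    segs = segment_steady_states(power, tolerance)
    return [(nxt[0], nxt[2] - cur[2]) for cur, nxt in zip(segs, segs[1:])]

# Hypothetical readings: an appliance of about 1000 W switches on, then off.
readings = [100, 102, 98, 100, 1100, 1098, 1102, 1100, 100, 100]
events = detect_events(readings)
print(events)  # → [(4, 1000.0), (8, -1000.0)]
```

A positive delta marks a switch-on event and a negative delta a switch-off event; in a full pipeline these deltas would be among the features passed to the load classifier.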


2016 ◽  
Vol 81 (3) ◽  
pp. 1929-1956 ◽  
Author(s):  
Robert Stewart ◽  
Marie Urban ◽  
Samantha Duchscherer ◽  
Jason Kaufman ◽  
April Morton ◽  
...  

2019 ◽  
Vol 8 (3) ◽  
pp. 1572-1580

Tourism is one of the most important sectors contributing to the economic growth of India. Big data analytics has recently been applied in the tourism sector for activities such as tourism demand forecasting, prediction of tourists' interests, and identification of tourist attraction elements and behavioural patterns. The major objective of this study is to demonstrate how big data analytics could be applied to predict the travel behaviour of international and domestic tourists. Machine learning algorithms and techniques also play a significant role in processing big data, so the combination of machine learning and big data is a state-of-the-art method that has been acclaimed internationally. While big data analytics and its application to the tourism industry have attracted the interest of a few researchers in recent times, there has not been much research in this area, particularly with respect to India. This study intends to describe how big data analytics could be used to forecast the travel behaviour of Indian tourists. To add value to the research, the study also intends to categorize the grounds on which tourists choose domestic tourism and those on which they choose international tourism. Online data sets of place reviews from the cities of Chicago, Beijing, New York, Dubai, San Francisco, London, New Delhi and Shanghai have been gathered, and an association rule mining based algorithm has been applied to the data set in order to attain the objectives of the study.
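The abstract does not say which association rule mining algorithm was used, so the following Python sketch only illustrates the general technique: it counts item pairs across transactions and keeps the rules A → B whose support and confidence clear given thresholds. The function name, the thresholds, and the toy "visited attractions" data are hypothetical and are not taken from the study.

```python
from collections import Counter
from itertools import combinations

def pair_rules(transactions, min_support=0.5, min_confidence=0.6):
    """Mine single-item rules (lhs -> rhs) from a list of item sets.

    support    = share of transactions containing both items
    confidence = support(lhs and rhs) / support(lhs)
    Returns a sorted list of (lhs, rhs, support, confidence) tuples.
    """
    n = len(transactions)
    item_counts, pair_counts = Counter(), Counter()
    for t in transactions:
        items = sorted(set(t))          # sort so each pair has one canonical order
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)

# Hypothetical data: attractions mentioned together in four reviews.
reviews = [
    {"museum", "park"},
    {"museum", "park", "beach"},
    {"museum", "beach"},
    {"park"},
]
rules = pair_rules(reviews)
for lhs, rhs, support, confidence in rules:
    print(f"{lhs} -> {rhs}: support={support:.2f}, confidence={confidence:.2f}")
```

In this toy data every review that mentions "beach" also mentions "museum", so the rule beach → museum has confidence 1.0, while museum → beach holds in only two of three cases.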


2019 ◽  
Vol 8 (4) ◽  
pp. 7356-7360

Data analytics is a scientific as well as an engineering tool used to investigate raw data and turn information into knowledge. It is normally associated with obtaining knowledge from reliable information sources, rapid information processing, and prediction from the analyzed data. Big data analytics is evolving rapidly and is characterized by volume, velocity and variety. Most organizations are now concentrating on analyzing information or raw data and are interested in deploying analytics to cope with forthcoming issues and challenges. This research proposes a prediction (intelligent) model that applies machine learning algorithms to the data set; the results are then interpreted and analyzed to obtain a better forecast value. The major objective of this research work is to find the optimum prediction from a medical data set using machine learning techniques.


2019 ◽  
Vol 64 (1) ◽  
pp. 97-117 ◽  
Author(s):  
William A. Donohue ◽  
Qi Hao ◽  
Richard Spreng ◽  
Charles Owen

The purpose of this article is to illustrate innovations in text analysis associated with understanding conflict-related communication events. Two innovations will be explored: LIWC (Linguistic Inquiry and Word Count), the text modeling program from the open-source data analysis software program R, and SPSS Modeler. The LIWC analysis revisits the 2009 study by Donohue and Druckman and the 2014 study by Donohue, Liang, and Druckman focusing on text analyses of the Oslo I Accords between the Palestinians and Israelis to illustrate this approach. The R and SPSS modeling of text analysis use the same data set as the LIWC analysis to provide a different set of pictures associated with each leader’s rhetoric during the period in which the Oslo I accords were being negotiated. Each innovation provides different insights into the mind-set of the two groups of leaders as the secret talks were emerging. The implications of each approach in establishing an understanding of the communication exchanges are discussed to conclude the article.

