A Letter to the Journal of Statistics and Data Science Education — A Call for Review of “OkCupid Data for Introductory Statistics and Data Science Courses” by Albert Y. Kim and Adriana Escobedo-Land

AbstractDue to the recent explosion of big data, our society has been rapidly going through digital transformation and entering a new world with numerous eye-opening developments. These new trends impact the society and future jobs, and thus student careers. At the heart of this digital transformation is data science, the discipline that makes sense of big data. With many rapidly emerging digital challenges ahead of us, this article discusses perspectives on iSchools’ opportunities and suggestions in data science education. We argue that iSchools should empower their students with “information computing” disciplines, which we define as the ability to solve problems and create values, information, and knowledge using tools in application domains. As specific approaches to enforcing information computing disciplines in data science education, we suggest the three foci of user-based, tool-based, and application-based. These three foci will serve to differentiate the data science education of iSchools from that of computer science or business schools. We present a layered Data Science Education Framework (DSEF) with building blocks that include the three pillars of data science (people, technology, and data), computational thinking, data-driven paradigms, and data science lifecycles. Data science courses built on the top of this framework should thus be executed with user-based, tool-based, and application-based approaches. This framework will help our students think about data science problems from the big picture perspective and foster appropriate problem-solving skills in conjunction with broad perspectives of data science lifecycles. We hope the DSEF discussed in this article will help fellow iSchools in their design of new data science curricula.

Download Full-text

An AI-Based System for Formative and Summative Assessment in Data Science Courses

International Journal of Artificial Intelligence in Education ◽

10.1007/s40593-020-00230-2 ◽

2020 ◽

Author(s):

Pierpaolo Vittorini ◽

Stefano Menini ◽

Sara Tonelli

Keyword(s):

Data Science ◽

Ad Hoc ◽

Online Courses ◽

Assessment Tools ◽

Science Courses ◽

Summative Assessments ◽

Assessment Activity ◽

Automated Grading ◽

Automated Feedback

AbstractMassive open online courses (MOOCs) provide hundreds of students with teaching materials, assessment tools, and collaborative instruments. The assessment activity, in particular, is demanding in terms of both time and effort; thus, the use of artificial intelligence can be useful to address and reduce the time and effort required. This paper reports on a system and related experiments finalised to improve both the performance and quality of formative and summative assessments in specific data science courses. The system is developed to automatically grade assignments composed of R commands commented with short sentences written in natural language. In our opinion, the use of the system can (i) shorten the correction times and reduce the possibility of errors and (ii) support the students while solving the exercises assigned during the course through automated feedback. To investigate these aims, an ad-hoc experiment was conducted in three courses containing the specific topic of statistical analysis of health data. Our evaluation demonstrated that automated grading has an acceptable correlation with human grading. Furthermore, the students who used the tool did not report usability issues, and those that used it for more than half of the exercises obtained (on average) higher grades in the exam. Finally, the use of the system reduced the correction time and assisted the professor in identifying correction errors.

Download Full-text