OkCupid Data for Introductory Statistics and Data Science Courses

Abstract: Massive open online courses (MOOCs) provide hundreds of students with teaching materials, assessment tools, and collaborative instruments. Assessment, in particular, is demanding in terms of both time and effort; thus, artificial intelligence can help reduce the time and effort required. This paper reports on a system and related experiments aimed at improving both the performance and quality of formative and summative assessments in specific data science courses. The system is developed to automatically grade assignments composed of R commands commented with short sentences written in natural language. In our opinion, the use of the system can (i) shorten correction times and reduce the possibility of errors and (ii) support students, through automated feedback, while they solve the exercises assigned during the course. To investigate these aims, an ad hoc experiment was conducted in three courses covering the statistical analysis of health data. Our evaluation demonstrated that automated grading has an acceptable correlation with human grading. Furthermore, the students who used the tool did not report usability issues, and those who used it for more than half of the exercises obtained (on average) higher grades in the exam. Finally, the use of the system reduced the correction time and assisted the professor in identifying correction errors.


Impact of Experimenting Computational Statistics for Data Science in Introductory Statistics Course: A Malaysian Perspective

Universal Journal of Educational Research ◽

10.13189/ujer.2020.080234 ◽

2020 ◽

Vol 8 (2) ◽

pp. 616-621

Author(s):

Kathiresan Gopal ◽

Dr. Nur Raidah Salim ◽

Dr. Ahmad Fauzi Mohd Ayub

Keyword(s):

Data Science ◽

Introductory Statistics ◽

Computational Statistics ◽

Statistics Course


ANALYSIS OF DATA SCIENCE COURSES THROUGH THE PRISM OF THE DIGITAL DIVIDE

EDULEARN18 Proceedings ◽

10.21125/edulearn.2018.0993 ◽

2018 ◽

Author(s):

Iva Kostadinova ◽

Pepa Petrova ◽

Evtim Iliev

Keyword(s):

Digital Divide ◽

Data Science ◽

Science Courses


Teaching Statistics

10.1093/oso/9780198785699.001.0001 ◽

2017 ◽

Cited By ~ 3

Author(s):

Andrew Gelman ◽

Deborah Nolan

Keyword(s):

Social Sciences ◽

Data Science ◽

Teaching Experience ◽

Introductory Statistics ◽

Social Scientists ◽

What Works ◽

Years Of Teaching Experience ◽

Statistics Course ◽

Set Up ◽

Selection Of

Students in the sciences, economics, social sciences, and medicine take an introductory statistics course. And yet statistics can be notoriously difficult for instructors to teach and for students to learn. To help overcome these challenges, Gelman and Nolan have put together this fascinating and thought-provoking book. Based on years of teaching experience, the book provides a wealth of demonstrations, activities, examples, and projects that involve active student participation. Part I of the book presents a large selection of activities for introductory statistics courses, with chapters such as ‘First week of class’, which offers exercises to break the ice and get students talking, followed by descriptive statistics, graphics, linear regression, data collection (sampling and experimentation), probability, inference, and statistical communication. Part II gives tips on what works and what doesn’t, how to set up effective demonstrations, and how to encourage students to participate in class and to work effectively in group projects. Course plans for introductory statistics, statistics for social scientists, and communication and graphics are provided. Part III presents material for more advanced courses on topics such as decision theory, Bayesian statistics, sampling, and data science.


CURRICULUM DESIGN FOR STUDENTS IN MATHEMATICS: DATA SCIENCE COURSES AS EXAMPLES

Educational Innovations and Applications ◽

10.35745/ecei2019v2.038 ◽

2019 ◽

Author(s):

Chia Hung Kao ◽

Keyword(s):

Curriculum Design ◽

Data Science ◽

Science Courses


Implementing version control with Git and GitHub as a learning objective in statistics and data science courses

Journal of Statistics Education ◽

10.1080/10691898.2020.1848485 ◽

2020 ◽

pp. 1-35

Author(s):

Matthew D. Beckman ◽

Mine Çetinkaya-Rundel ◽

Nicholas J. Horton ◽

Colin W. Rundel ◽

Adam J. Sullivan ◽

...

Keyword(s):

Data Science ◽

Learning Objective ◽

Version Control ◽

Science Courses


Big Data and Data Science: Opportunities and Challenges of iSchools

Journal of Data and Information Science ◽

10.1515/jdis-2017-0011 ◽

2017 ◽

Vol 2 (3) ◽

pp. 1-18 ◽

Cited By ~ 12

Author(s):

Il-Yeol Song ◽

Yongjun Zhu

Keyword(s):

Big Data ◽

Science Education ◽

Data Science ◽

Computational Thinking ◽

Building Blocks ◽

Digital Transformation ◽

Problem Solving Skills ◽

Science Courses ◽

Big Picture ◽

Eye Opening

Abstract: Due to the recent explosion of big data, our society has been rapidly going through digital transformation and entering a new world with numerous eye-opening developments. These new trends impact society and future jobs, and thus student careers. At the heart of this digital transformation is data science, the discipline that makes sense of big data. With many rapidly emerging digital challenges ahead of us, this article discusses perspectives on iSchools’ opportunities and suggestions in data science education. We argue that iSchools should empower their students with “information computing” disciplines, which we define as the ability to solve problems and create values, information, and knowledge using tools in application domains. As specific approaches to enforcing information computing disciplines in data science education, we suggest three foci: user-based, tool-based, and application-based. These three foci will serve to differentiate the data science education of iSchools from that of computer science or business schools. We present a layered Data Science Education Framework (DSEF) with building blocks that include the three pillars of data science (people, technology, and data), computational thinking, data-driven paradigms, and data science lifecycles. Data science courses built on top of this framework should thus be executed with user-based, tool-based, and application-based approaches. This framework will help our students think about data science problems from a big-picture perspective and foster appropriate problem-solving skills in conjunction with broad perspectives of data science lifecycles. We hope the DSEF discussed in this article will help fellow iSchools in their design of new data science curricula.


Teaching stats for data science

10.7287/peerj.preprints.3205v1 ◽

2017 ◽

Author(s):

Daniel T Kaplan

Keyword(s):

20th Century ◽

Data Science ◽

Statistics Education ◽

Introductory Statistics ◽

Applied Statistics ◽

Causal Relationships ◽

Chi Squared

The familiar mathematical topics of introductory statistics --- means, proportions, t-tests, normal and t distributions, chi-squared, etc. --- are a product of the first half of the 20th century. Naturally, they reflect the statistical conditions of that era: scarce data (e.g. n < 10) originating in benchtop or agricultural experiments, and algorithms communicated via algebraic formulas. Today, applied statistics relates to a different environment: software is the means of algorithmic communication, observational and "unplanned" data are interpreted for causal relationships, and data are large both in n and in the number of variables. This change in situation calls for a thorough rethinking of the topics in and approach to statistics education. This paper presents a set of ten organizing blocks for intro stats that are better suited to today's environment.
