OkCupid Data for Introductory Statistics and Data Science Courses

Author(s):  
Albert Y. Kim ◽  
Adriana Escobedo-Land
Author(s):  
Pierpaolo Vittorini ◽  
Stefano Menini ◽  
Sara Tonelli

Massive open online courses (MOOCs) provide hundreds of students with teaching materials, assessment tools, and collaborative instruments. The assessment activity, in particular, is demanding in terms of both time and effort; artificial intelligence can therefore help reduce the time and effort required. This paper reports on a system and related experiments aimed at improving both the performance and quality of formative and summative assessments in specific data science courses. The system is developed to automatically grade assignments composed of R commands commented with short sentences written in natural language. In our opinion, the use of the system can (i) shorten correction times and reduce the possibility of errors and (ii) support students, through automated feedback, while they solve the exercises assigned during the course. To investigate these aims, an ad hoc experiment was conducted in three courses covering the statistical analysis of health data. Our evaluation demonstrated that automated grading has an acceptable correlation with human grading. Furthermore, the students who used the tool did not report usability issues, and those who used it for more than half of the exercises obtained (on average) higher grades in the exam. Finally, the use of the system reduced the correction time and assisted the professor in identifying correction errors.


Author(s):  
Andrew Gelman ◽  
Deborah Nolan

Students in the sciences, economics, social sciences, and medicine take an introductory statistics course. And yet statistics can be notoriously difficult for instructors to teach and for students to learn. To help overcome these challenges, Gelman and Nolan have put together this fascinating and thought-provoking book. Based on years of teaching experience, the book provides a wealth of demonstrations, activities, examples, and projects that involve active student participation. Part I of the book presents a large selection of activities for introductory statistics courses. Its chapters begin with ‘First week of class’, with exercises to break the ice and get students talking, and continue with descriptive statistics, graphics, linear regression, data collection (sampling and experimentation), probability, inference, and statistical communication. Part II gives tips on what works and what doesn’t, how to set up effective demonstrations, and how to encourage students to participate in class and to work effectively in group projects. Course plans for introductory statistics, statistics for social scientists, and communication and graphics are provided. Part III presents material for more advanced courses on topics such as decision theory, Bayesian statistics, sampling, and data science.


Author(s):  
Matthew D. Beckman ◽  
Mine Çetinkaya-Rundel ◽  
Nicholas J. Horton ◽  
Colin W. Rundel ◽  
Adam J. Sullivan ◽  
...  

2017 ◽  
Vol 2 (3) ◽  
pp. 1-18 ◽  
Author(s):  
Il-Yeol Song ◽  
Yongjun Zhu

Due to the recent explosion of big data, our society has been rapidly going through digital transformation and entering a new world with numerous eye-opening developments. These new trends impact society and future jobs, and thus student careers. At the heart of this digital transformation is data science, the discipline that makes sense of big data. With many rapidly emerging digital challenges ahead of us, this article discusses perspectives on iSchools’ opportunities and suggestions in data science education. We argue that iSchools should empower their students with “information computing” disciplines, which we define as the ability to solve problems and create value, information, and knowledge using tools in application domains. As specific approaches to enforcing information computing disciplines in data science education, we suggest three foci: user-based, tool-based, and application-based. These three foci will serve to differentiate the data science education of iSchools from that of computer science or business schools. We present a layered Data Science Education Framework (DSEF) with building blocks that include the three pillars of data science (people, technology, and data), computational thinking, data-driven paradigms, and data science lifecycles. Data science courses built on top of this framework should thus be executed with user-based, tool-based, and application-based approaches. This framework will help our students think about data science problems from a big-picture perspective and foster appropriate problem-solving skills in conjunction with broad perspectives of data science lifecycles. We hope the DSEF discussed in this article will help fellow iSchools in their design of new data science curricula.


2017 ◽  
Author(s):  
Daniel T Kaplan

The familiar mathematical topics of introductory statistics (means, proportions, t-tests, normal and t distributions, chi-squared, etc.) are a product of the first half of the 20th century. Naturally, they reflect the statistical conditions of that era: scarce data (e.g. n < 10) originating in benchtop or agricultural experiments, and algorithms communicated via algebraic formulas. Today, applied statistics relates to a different environment: software is the means of algorithmic communication, observational and "unplanned" data are interpreted for causal relationships, and data are large both in n and in the number of variables. This change in situation calls for a thorough rethinking of the topics in and approach to statistics education. This paper presents a set of ten organizing blocks for intro stats that are better suited to today's environment.

