Teaching stats for data science
The familiar mathematical topics of introductory statistics (means, proportions, t-tests, normal and t distributions, chi-squared, etc.) are a product of the first half of the 20th century. Naturally, they reflect the statistical conditions of that era: scarce data (e.g. n < 10) originating in benchtop or agricultural experiments, and algorithms communicated via algebraic formulas. Today, applied statistics operates in a different environment: software is the means of algorithmic communication, observational and "unplanned" data are interpreted for causal relationships, and data are large in both n and the number of variables. This change in situation calls for a thorough rethinking of both the topics of and the approach to statistics education. This paper presents a set of ten organizing blocks for intro stats that are better suited to today's environment.