Forecasting Undergraduate Majors Using Academic Transcript Data
Abstract: Committing to a major is a fateful step in an undergraduate education, yet the relationship between courses taken early in an academic career and ultimate major selection remains little studied at scale. Using transcript data capturing the academic careers of 26,892 undergraduates enrolled at a private university between 2000 and 2020, we describe enrollment histories using natural-language methods and vector embeddings to forecast terminal major on the basis of course sequences beginning at college entry. We find that (I) a student's very first enrolled course predicts major thirty times better than random guessing and more than a third better than majority-class voting, (II) modeling strategies substantially influence forecasting accuracy, and (III) course portfolios vary substantially within majors, raising novel questions about what majors mean or signify in relation to undergraduate course histories.