A crosswalk to align the GSS and NIH grants databases based on university names
Abstract
Notable reports over the past dozen years have recommended that the federal government and others improve data collection on the research workforce in the United States. The federal government already collects a wealth of data, but important data points, such as the number of biomedical postdocs working in the U.S., are still not well defined. Furthermore, among the data that are collected, differences in collection methods and naming conventions, such as inconsistent naming of universities, hinder our ability to merge information across datasets. Here I describe the creation of three macros meant to align the National Center for Science and Engineering Statistics’ Survey of Graduate Students and Postdoctorates in Science and Engineering (GSS) with NIH grants databases by consolidating university names under a single name common to both databases. Aligning these databases will allow for a deeper understanding of how various federal and university policies affect the number of trainees and grants and the amount of funding at individual institutions.