Improving Ecological Inference by Predicting Individual Ethnicity from Voter Registration Records

Citation: Imai, K., & Khanna, K. (2016). Improving ecological inference by predicting individual ethnicity from voter registration records. Political Analysis, 24(2), 263-272.

Abstract: In academic research on racial politics and voting rights litigation, one must infer turnout and vote choice for each racial group using aggregate election results and racial composition. Over the last several decades, a number of statistical methods have been proposed to address this ecological inference problem. We show how to reduce aggregation bias by predicting individual-level ethnicity from voter registration records. Building on the existing methodological literature, we show how to combine the Census Bureau's Surname List with various information from geocoded voter registration records via Bayes’ rule. We evaluate the performance of the proposed methodology using approximately nine million voter registration records in Florida, where self-reported ethnicity is available. We find that it is possible to reduce the false positive rate among Black and Latino voters to 6% and 3%, respectively, while maintaining the true positive rate at over 80%. Moreover, we use our predictions to estimate turnout by race and find that our estimates result in substantially less bias and root mean squared error than standard ecological inference estimates.
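As an illustration of the Bayes' rule step mentioned in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes that surname and geolocation are conditionally independent given race, so that Pr(race | surname, location) is proportional to Pr(race | surname) × Pr(location | race). The function name, the race categories, and all probabilities are hypothetical and chosen only for illustration.

```python
def posterior_race(p_race_given_surname, p_geo_given_race):
    """Combine surname and geolocation information via Bayes' rule.

    Assumes surname and geolocation are conditionally independent given race:
        Pr(race | surname, geo) is proportional to
        Pr(race | surname) * Pr(geo | race)
    Both arguments are dicts keyed by the same race categories.
    """
    unnormalized = {
        race: p_race_given_surname[race] * p_geo_given_race[race]
        for race in p_race_given_surname
    }
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("All joint probabilities are zero; cannot normalize.")
    return {race: value / total for race, value in unnormalized.items()}


# Hypothetical inputs (NOT real Census or voter-file figures).
# Pr(race | surname), as might be read from a surname list:
p_race_given_surname = {
    "white": 0.05, "black": 0.01, "latino": 0.92, "asian": 0.01, "other": 0.01,
}
# Pr(geolocation | race): share of each group living in the voter's area:
p_geo_given_race = {
    "white": 0.0002, "black": 0.0001, "latino": 0.0010, "asian": 0.0003, "other": 0.0002,
}

posterior = posterior_race(p_race_given_surname, p_geo_given_race)
predicted = max(posterior, key=posterior.get)  # category with highest posterior
```

Classifying each voter by the highest posterior probability, as in the last line, is one simple decision rule; other thresholds trade off false positive and true positive rates differently.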

Links: full article, online appendix, replication materials, software package

ROC.png