This product, "Digital spatial data for observed, predicted, and misclassification errors for observations in the training dataset for nitrate and arsenic concentrations in basin-fill aquifers in the Southwest Principal Aquifers study area," is a 1:250,000-scale point spatial dataset developed as part of the regional Southwest Principal Aquifers (SWPA) study (Anning and others, 2012). The study examined the vulnerability of basin-fill aquifers in the southwestern United States to nitrate contamination and arsenic enrichment. Statistical models were developed with the random forest classifier algorithm to predict concentrations of nitrate and arsenic across a model grid that represents local- and basin-scale measures of source, aquifer susceptibility, and geochemical conditions.
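
As an illustration of how observed classes, predicted classes, and misclassification errors relate for training observations, the following is a minimal sketch, not the study's method. It assumes that concentrations are binned into ordered classes and that the misclassification error is the predicted class rank minus the observed class rank. The class labels, site identifiers, and values are hypothetical and are not taken from the dataset.

```python
# Hypothetical ordered concentration classes (assumption, not from the dataset).
CLASSES = ["low", "moderate", "high"]


def misclassification_error(observed, predicted):
    """Predicted class rank minus observed class rank (assumed definition).

    0 means the model matched the observation; a positive value means the
    model predicted a higher class, a negative value a lower class.
    """
    return CLASSES.index(predicted) - CLASSES.index(observed)


# Each record is (site id, observed class, predicted class); all values invented.
records = [
    ("site-1", "low", "low"),
    ("site-2", "moderate", "high"),
    ("site-3", "high", "low"),
]

# Per-site misclassification error for the training observations.
errors = {site: misclassification_error(obs, pred) for site, obs, pred in records}

# Share of training observations whose predicted class differs from the observed one.
misclassified_fraction = sum(1 for e in errors.values() if e != 0) / len(errors)

print(errors)
print(misclassified_fraction)
```

Under these assumptions, a signed error keeps the direction of the disagreement, so overpredicted and underpredicted sites can be distinguished when the points are mapped.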