Climate data obtained from global climate models (GCMs) form the basis of most studies of regional climate change and its impacts. Using the northeastern US as a test case, we develop a framework to systematically sub-select reliable models for use in climate change studies in the region. We retain 14 of 36 CMIP5 GCMs that (a) have satisfactory historical performance and (b) provide diverse climate scenarios consistent with the uncertainties in the multi-model ensemble (MME). Historical performance is evaluated for a wide variety of standard and process metrics, including large-scale atmospheric circulation features that drive regional climate variability. Model performance is then used together with an assessment of diversity and redundancy in model projections to eliminate models without underrepresenting the uncertainty in the MME. Overall, model performance varies considerably across metrics and seasons, and no single model emerges as the best. This, combined with the lack of a strong relationship between model biases and future projections, highlights the importance of maintaining diversity in projections for risk assessment. The summer mean precipitation projections, in particular, are uncertain but also show considerable redundancy in their spatial patterns within the ensemble, which we use to eliminate models. The better-performing models in the retained set do suggest a potential to narrow the ranges in temperature and precipitation projections. However, any further refinement should be based on a detailed analysis of the physical processes that drive regional climate variability and extremes, to avoid providing overconfident projections.
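The two-step idea described above (screen models on historical performance, then remove models whose projections are redundant) can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration and is not the procedure used in the study: the function name `subselect`, the single aggregate `skill` score, the thresholds `min_skill` and `max_corr`, and the use of spatial pattern correlation as the redundancy measure.

```python
import numpy as np

def subselect(names, skill, patterns, min_skill=0.5, max_corr=0.9):
    """Hypothetical two-step model sub-selection.

    names    : list of model names
    skill    : aggregate historical performance per model (higher is better; assumed)
    patterns : array (n_models, n_gridpoints) of projected change patterns
    """
    skill = np.asarray(skill, dtype=float)
    patterns = np.asarray(patterns, dtype=float)

    # Step 1: keep only models with satisfactory historical performance.
    candidates = [i for i in range(len(names)) if skill[i] >= min_skill]

    # Step 2: visit candidates from best to worst skill and drop any model
    # whose projection pattern is highly correlated with one already kept.
    candidates.sort(key=lambda i: -skill[i])
    corr = np.corrcoef(patterns)  # model-by-model pattern correlation
    kept = []
    for i in candidates:
        if all(corr[i, j] < max_corr for j in kept):
            kept.append(i)

    return [names[i] for i in sorted(kept)]


# Toy example with made-up numbers (not real model output).
names = ["A", "B", "C", "D"]
skill = [0.9, 0.8, 0.3, 0.7]
patterns = [
    [1.0, 2.0, 3.0, 4.0],
    [1.1, 2.1, 2.9, 4.2],  # nearly the same pattern as A
    [4.0, 3.0, 2.0, 1.0],  # distinct pattern, but low skill
    [4.0, 1.0, 3.0, 2.0],
]
selected = subselect(names, skill, patterns)
print(selected)
```

In this toy case, model C is screened out on skill and model B is dropped as redundant with the better-scoring model A. A fuller version would also verify that the retained set still spans the range of projections in the full ensemble, which this sketch does not check.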