Forecast accuracy is typically measured in terms of a given loss function. However, because misspecified models are used in multiple model comparisons, relative forecast rankings are loss-function dependent. To address this issue, we introduce a novel criterion for forecast evaluation that utilizes the entire distribution of forecast errors. In particular, we introduce the concepts of general-loss (GL) forecast superiority and convex-loss (CL) forecast superiority, and we develop tests for GL and CL superiority that are based on an out-of-sample generalization of the tests introduced by Linton, Maasoumi, and Whang (2005, Review of Economic Studies 72, 735–765). Under the null, our test statistics have nonstandard limiting distributions, so resampling procedures are needed to obtain critical values. Additionally, the tests are consistent and have nontrivial power under a sequence of local alternatives. The theory is developed for the stationary case, as well as for the case of heterogeneity induced by distributional change over time. Monte Carlo simulations suggest that the tests perform reasonably well in finite samples, and an application to exchange rate data indicates that our tests can help identify superior forecasting models, regardless of loss function.
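The loss-function dependence of forecast rankings can be seen in a small numerical sketch. The two error series below are hypothetical, chosen only for illustration, and the helper names (`mean_loss`, `preferred`, `ecdf_abs`) are assumptions rather than anything taken from the paper. The final comparison of empirical distribution functions is a crude check in the spirit of looking at the entire error distribution; it is not the GL or CL test statistic.

```python
# Hypothetical forecast errors for two models (illustrative numbers only).
errors_a = [0.1, 0.1, 0.1, 0.1, 3.0]  # usually accurate, one large miss
errors_b = [1.0, 1.0, 1.0, 1.0, 1.0]  # consistently mediocre


def absolute_loss(e):
    return abs(e)


def squared_loss(e):
    return e * e


def mean_loss(errors, loss):
    """Average loss of a sequence of forecast errors."""
    return sum(loss(e) for e in errors) / len(errors)


def preferred(loss):
    """Label of the model with the smaller average loss."""
    loss_a = mean_loss(errors_a, loss)
    loss_b = mean_loss(errors_b, loss)
    return "A" if loss_a < loss_b else "B"


def ecdf_abs(errors, x):
    """Empirical CDF of the absolute errors, evaluated at x."""
    return sum(abs(e) <= x for e in errors) / len(errors)


# The ranking flips with the loss function.
ranking = {
    "absolute": preferred(absolute_loss),
    "squared": preferred(squared_loss),
}
print(ranking)

# The empirical CDFs of the absolute errors cross, so neither error
# distribution lies uniformly closer to zero than the other.
for x in (0.5, 2.0):
    print(x, ecdf_abs(errors_a, x), ecdf_abs(errors_b, x))
```

In this sketch model A is preferred under absolute loss and model B under squared loss, and the empirical distribution functions of the absolute errors cross. A ranking that holds regardless of loss function would require one error distribution to be ordered relative to the other over its whole support, which is the kind of comparison a distribution-based criterion formalizes.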