Area-based conservation is a widely used approach for maintaining biodiversity, and there is ongoing debate over what an appropriate global target for conservation area coverage should be. To inform this debate, it is necessary to know the extent and ecological representativeness of the current conservation area network, but this is hampered by gaps in existing global datasets. In particular, although data on privately and community-governed protected areas and other effective area-based conservation measures are often available at the national level, it can take many years to incorporate them into official datasets. This suggests the need for a complementary approach, based on selecting a sample of countries and using their national-scale datasets to produce more accurate metrics. However, every country added to the sample increases the costs of data collection, collation and analysis. To address this, we present a data collection framework underpinned by a spatial prioritization algorithm, which identifies a minimum set of countries that is also representative of 10 factors that influence conservation area establishment and biodiversity patterns. We illustrate this approach by identifying a representative set of sampling units that covers 10% of the terrestrial realm; this set included areas in only 25 countries. In contrast, selecting 10% of the terrestrial realm at random included areas in a mean of 162 countries. These sampling units could be the focus of future data collation on different types of conservation area. Analysing these data could produce more rapid and accurate estimates of global conservation area coverage and ecological representativeness, complementing existing international reporting systems.