Validation of a province-wide commercial food store dataset in a heterogeneous predominantly rural food environment

Nathan GA Taylor; Jillian Stymest; Catherine L Mah

doi:10.1017/S1368980019004506

Published online by Cambridge University Press: 16 April 2020

Nathan GA Taylor*: Affiliation:
Faculty of Health, School of Health Administration, Dalhousie University, 5850 College Street, PO Box 15000, Halifax, NS B3H 4R2, Canada
Jillian Stymest: Affiliation:
Faculty of Health, School of Health Administration, Dalhousie University, 5850 College Street, PO Box 15000, Halifax, NS B3H 4R2, Canada
Catherine L Mah: Affiliation:
Faculty of Health, School of Health Administration, Dalhousie University, 5850 College Street, PO Box 15000, Halifax, NS B3H 4R2, Canada; Dalla Lana School of Public Health, University of Toronto, 155 College Street, Toronto, ON M5T 3M7, Canada
*Corresponding author: Email nathan.taylor@dal.ca

Abstract

Objective:

Commercially available business (CAB) datasets for food environments have been investigated for error in large urban contexts and some rural areas, but there is a relative dearth of literature that reports error across regions of variable rurality. The objective of the current study was to assess the validity of a CAB dataset using a government dataset at the provincial scale.

Design:

A ground-truthed dataset provided by the government of Newfoundland and Labrador (NL) was used to assess a popular commercial dataset. Concordance, sensitivity, positive-predictive value (PPV) and geocoding errors were calculated. Measures were stratified by store types and rurality to investigate any association between these variables and database accuracy.

Setting:

NL, Canada.

Participants:

The current analysis used store-level (ecological) data.

Results:

Of 1125 stores, there were 380 stores that existed in both datasets and were considered true-positive stores. The mean positional error between a ground-truthed and test point was 17·72 km. When compared with the provincial dataset of businesses, grocery stores had the greatest agreement, sensitivity = 0·64, PPV = 0·60 and concordance = 0·45. Gas stations had the least agreement, sensitivity = 0·26, PPV = 0·32 and concordance = 0·17. Only 4 % of commercial data points in rural areas matched every criterion examined.

Conclusions:

The commercial dataset exhibits a low level of agreement with the ground-truthed provincial data. In particular, retailers in rural areas or belonging to the gas station category suffered from misclassification and/or geocoding errors. Taken together, the commercial dataset is differentially representative of the ground-truthed reality based on store-type and rurality/urbanity.

Keywords

Food environment; Commercial data; Secondary data; Validation; Food retail; Store-type; Rurality; Canada

Type: Short Communication
Information: Public Health Nutrition, Volume 23, Issue 11, August 2020, pp. 1889–1895

DOI: https://doi.org/10.1017/S1368980019004506
Copyright: © The Authors 2020

Diet-related causes are the leading risk factor for death and disability globally^{(Reference Soriano, Abajobir and Abate1)}, and as of 2017, responsible for 11 million deaths and 255 million disability-adjusted life years^{(Reference Afshin, John Sur and Fay2)}. In an effort to address this growing burden, dietary behaviour research has focused on associations between retail food environment and diet^{(Reference Kleinert and Horton3)}, including how the community distribution of retailers, and consumer factors such as food price, access, availability and promotion can influence diet and disease^{(Reference Giskes, van Lenthe and Avendano-Pabon4–Reference Glanz, Sallis and Saelens6)}. Within and across studies, there remains a great deal of heterogeneity in terms of methods, outcomes and data quality^{(Reference McKinnon, Reedy and Morrissette7–Reference Lytle and Sokol9)}.

The majority of food environment studies have used geographic information systems to process community-level data^{(Reference McKinnon, Reedy and Morrissette7–Reference Lytle and Sokol9)}. An important methodological issue in this literature is the use of commercially produced, unvalidated secondary datasets in the assessment of community food environments, which is a potential source of misclassification bias^{(Reference Lebel, Daepp and Block10,Reference Fleischhacker, Evenson and Sharkey11)}. For instance, in a recent case study, Lebel et al. noted variable levels of correlation between per capita exposures defined by commercially available business (CAB) data and government data when stratified by store type^{(Reference Lebel, Daepp and Block10)}.

Studies of CAB data are often structured as a comparison between CAB data and a ‘gold-standard’ government and/or primary dataset^{(Reference Lebel, Daepp and Block10,Reference Fleischhacker, Evenson and Sharkey11)}. These studies have applied conventional epidemiological diagnostic measures such as sensitivity and positive-predictive value (PPV)^{(Reference Trevethan12)} to assess the accuracy of a CAB dataset^{(Reference Lebel, Daepp and Block10)}. Previous validation studies in the North American context have demonstrated a wide range of agreement between CAB and other datasets^{(Reference Lebel, Daepp and Block10,Reference Clary and Kestens13,Reference Daepp and Black14)}. While some studies have reported high levels of agreement between commercial and governmental datasets in urban centres^{(Reference Lebel, Daepp and Block10)}, others have indicated that government datasets are less error-prone and may be better for specific food environment measures^{(Reference Daepp and Black14)}. As the literature has grown, data accuracy in rural contexts has increasingly been studied^{(Reference Lebel, Daepp and Block10,Reference Fleischhacker, Evenson and Sharkey11)}. For example, density measures^{(Reference Lebel, Daepp and Block10)} and representativity^{(Reference Clary and Kestens13)} are advanced diagnostic analyses that aim to assess the impact of data accuracy in relation to disease risk; these newer analyses can further highlight the differential impact of error on rural and urban exposures.

There is a relative dearth of literature that explores the rates of error across regions of variable rurality and stratifies by store type in Canada. Although several studies have focused on rural regions in the USA^{(Reference Sharkey15–Reference McGuirt, Jilcott and Vu19)} and UK^{(Reference Lake, Burgoine and Stamp20–Reference Wilkins, Radley and Morris22)}, rural Canada remains an understudied jurisdiction^{(Reference Minaker, Shuh and Olstad23)} with disproportionate levels of diet-related risk factors and poor health^{(Reference Bruner, Lawson and Pickett24–Reference Shearer, Blanchard and Kirk27)}. The objective of this paper is to compare government and commercial datasets at the provincial scale using diagnostic measures of agreement, across a spectrum of population centres, within industry-defined store classifications and to report associated rates of geospatial error.

Methods

Setting

Newfoundland and Labrador (NL) is the easternmost province of Canada. Residents of NL bear among the largest burdens of diet-related noncommunicable diseases^(28–30) and obesity in Canada, and population-based assessments indicate that dietary intake is significantly poorer^{(Reference Garriguet31–35)} than elsewhere in the country.

As of the 2016 census, NL is home to 519 716 people, and the largest share reside in rural areas. Our definition of rurality in the current study employs the Statistics Canada population centre classification scheme⁽³⁶⁾. A population centre is a dissemination block or set of contiguous dissemination blocks with a population of at least 1000 and a density of at least 400 persons per km² (approximately 1036 persons per square mile). Large population centres have populations of 100 000 or more; medium, 30 000–99 999; and small, 1000–29 999. All areas outside population centres are classified as rural⁽³⁶⁾. In NL, there are 27 small population centres (approximately 24 % of the population), no medium centres and 1 large centre (approximately 34 % of the population); the remainder are rural areas (approximately 42 % of the population). For comparison, the population proportions for Canada are 13 % in small population centres, 9 % in medium, 60 % in large and 19 % in rural areas.
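The classification scheme can be sketched in a few lines of Python. This is an illustration only: the function name and arguments are hypothetical, and the cut-points assume the published Statistics Canada definition (at least 1000 residents at a density of at least 400 persons per km², with small centres running up to 29 999 residents).

```python
def classify_population_centre(population: int, density_per_km2: float) -> str:
    """Assign a Statistics Canada population centre class to an area.

    Thresholds are assumptions taken from the published scheme:
    a population centre needs >= 1000 residents and >= 400 persons/km2.
    """
    if population < 1000 or density_per_km2 < 400:
        return "rural"
    if population >= 100_000:
        return "large"
    if population >= 30_000:
        return "medium"
    return "small"
```

In the analysis, each store would inherit the class of the area in which its point falls, which is what allows the diagnostic measures to be stratified by rurality.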

Data

The enhanced points of interest (EPOI) are a set of geocoded business and recreation points across Canada, compiled by DMTI Spatial. Attributes for these data include North American Industry Classification System (NAICS) and Standard Industrial Classification (SIC) codes, street address and phone number. EPOI 2015 data were provided by the Dalhousie GIS Centre. We extracted points located in NL with a primary NAICS code of 445110 for Grocery Stores and Other Grocery, 445120 for Convenience Stores, or 447190 and 447110 for Gas Stations.

EPOI data from the year 2015 were compared with a 2015 dataset from the NL provincial government. The government dataset was administrative data consisting of all licensed food premises in NL as of March 2015, obtained from the department responsible for on-site food safety inspections, governed by food premises legislation. Official government listings are considered a ‘gold standard’ alternative to researcher ground-truthed data^{(Reference Lebel, Daepp and Block10)}. A detailed description of the dataset has been provided previously^{(Reference Mah, Pomeroy and Knox37)}. Briefly, the inventory was cleaned and coded by business ownership type and NAICS code in consultation with government, and a subsample was verified using Google Street View. The same NAICS categories employed for the EPOI data were used to classify the NL data, although it is noteworthy that there was an observable divergence in coded stores (Table 1). Further, it is important to consider that this strategy applied NAICS definitions consistently at all levels of rurality.

Table 1 Classification and geospatial error of true-positive stores when comparing a provincial dataset with the DMTI enhanced points of interest in Newfoundland and Labrador, Canada (n 380)

NAICS, North American Industry Classification System.

Estimates are provided with CI.

* Values include outliers; for a discussion of the impact of outliers on mean estimates, see the Results section.

For all following calculations, the government dataset is the gold-standard, and the CAB dataset is the test.

Matching algorithm

Stores in the EPOI layer were matched to stores in the ground-truthed layer by first examining the name of the store. If the name was an exact match, a verification of matching addresses was performed, and the stores were considered matches. If the store names were similar, the address and coordinates were matched; if the addresses were then identical, the stores were considered a match. In the event of missing or incomplete data, the completed fields were verified through the Yellow Pages directory or through manual search engine verification. Stores included in one of the datasets but not the other were matched with a blank record for all fields.
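The matching rules can be illustrated with a minimal Python sketch. It is not the procedure used in the study, which relied on manual checks: the record fields (`name`, `address`), the text normalisation and the 0·8 similarity threshold are all assumptions introduced for illustration, and the manual directory verification step is not modelled.

```python
from difflib import SequenceMatcher


def normalise(text):
    """Lower-case, drop basic punctuation and collapse whitespace."""
    return " ".join(text.lower().replace(".", "").replace(",", "").split())


def match_type(cab, gov, threshold=0.8):
    """Return 'exact-name', 'similar-name' or None for a pair of records.

    Both routes require identical (normalised) addresses, as in the text.
    The similarity threshold is a hypothetical choice.
    """
    same_address = normalise(cab["address"]) == normalise(gov["address"])
    name_a, name_b = normalise(cab["name"]), normalise(gov["name"])
    if name_a == name_b:
        return "exact-name" if same_address else None
    if SequenceMatcher(None, name_a, name_b).ratio() >= threshold:
        return "similar-name" if same_address else None
    return None


def cross_tabulate(cab_stores, gov_stores):
    """Count TP, FP and FN, treating the government list as the gold standard."""
    unmatched_gov = list(gov_stores)
    tp = fp = 0
    for cab in cab_stores:
        hit = next((g for g in unmatched_gov if match_type(cab, g)), None)
        if hit is None:
            fp += 1  # present in the CAB data only
        else:
            tp += 1  # present in both datasets
            unmatched_gov.remove(hit)
    return {"TP": tp, "FP": fp, "FN": len(unmatched_gov)}
```

Government records left unmatched at the end are counted as false negatives, mirroring the blank-record pairing described above.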

Analysis

To perform our analysis, the CAB dataset was assessed for accuracy with respect to the gold-standard. We cross-tabulated all true positives (TP), true negatives (TN), false positives (FP) and false negatives (FN). Other descriptive statistics were then performed for TP, FP and FN stores. These cross-tabulations were repeated for the data after stratification by store-type, ownership and rurality. Stratification is a commonly used method to assess differences in exposures employed in food environment studies^{(Reference Daepp and Black14)}. Rural–urban classifications as well as store type and size are well-described potential stratifiers in the rural and regional context^{(Reference Cummins, Smith and Aitken38)}, and ownership (chain/independent) is an important emerging attribute given food industry consolidation^{(Reference Wrigley, Coe and Currah39)}.
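As a rough sketch of the stratified cross-tabulation, the snippet below counts match outcomes within each level of a stratifier. The record layout (a `status` field holding TP, FP or FN, plus one field per stratifier) is hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


def tabulate_by(records, stratum_key):
    """Count TP/FP/FN outcomes within each level of one stratifier."""
    table = defaultdict(Counter)
    for record in records:
        table[record[stratum_key]][record["status"]] += 1
    return {level: dict(counts) for level, counts in table.items()}
```

Calling the function once per stratifier (store type, ownership, population centre class) would produce the kind of breakdown reported in Table 3.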

Similar diagnostic assessments of secondary food environment datasets have been reported in detail elsewhere^{(Reference Daepp and Black14)}. To assess the accuracy of the CAB dataset in relation to the government dataset, three conventional diagnostic indicators were used: sensitivity, calculated as $\frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}}$; PPV, calculated as $\frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FP}}$; and concordance, calculated as $\frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FP} + \mathrm{FN}}$. Diagnostic values of less than 0·001 % (<0·001) were considered poor, below 20 % (95 % CI 0·001, 0·20) slight, 21–40 % (95 % CI 0·21, 0·40) fair, 41–60 % (95 % CI 0·41, 0·60) moderate, 61–80 % (95 % CI 0·61, 0·80) good and over 81 % (95 % CI 0·81, 1·00) almost perfect, similar to a previous report that employed the Landis scale^{(Reference Lebel, Daepp and Block10,Reference Landis and Koch40)}. All CI were calculated by approximating a normal distribution as reported previously^{(Reference Ma, Battersby and Bell41)}.
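The three indicators and their intervals can be sketched as follows. The interval shown is the standard normal-approximation (Wald) interval for a proportion, which is an assumption about what the normal approximation involved; the handling of values falling between the stated bands (for example 0·205) is likewise an assumption, and the function names are hypothetical.

```python
import math


def proportion_with_ci(successes, total, z=1.96):
    """Return (estimate, lower, upper) using a normal approximation."""
    p = successes / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


def diagnostics(tp, fp, fn):
    """Sensitivity, PPV and concordance, each with an approximate 95 % CI."""
    return {
        "sensitivity": proportion_with_ci(tp, tp + fn),
        "ppv": proportion_with_ci(tp, tp + fp),
        "concordance": proportion_with_ci(tp, tp + fp + fn),
    }


def agreement_label(value):
    """Map a diagnostic value (0 to 1) onto the agreement scale in the text."""
    if value < 0.001:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "good")]:
        if value <= upper:
            return label
    return "almost perfect"
```

Because concordance keeps both FP and FN in the denominator, it can never exceed either sensitivity or PPV, which is why it is the lowest of the three values reported for every store type.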

Finally, of the true-positive stores found in both datasets, values for NAICS code, store community as defined in both datasets (city, town or community name as listed) and geocoded positional accuracy within 100 m by Euclidean distance were compared for the EPOI data and the provincial dataset. To determine geocoded point accuracy, the distance between the provincial dataset point $(x_1, y_1)$ and its corresponding EPOI point $(x_2, y_2)$ was determined with the formula: $\sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$. All units were converted to metres. Analyses were performed with and without outliers; any values that fell more than 1·5 times the interquartile range above the third quartile or below the first quartile were deemed outliers. To avoid multiple regional projections and calculate relatively conservative estimates of error, EPSG:3857 was the coordinate system employed for point assignment and distance analysis.
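A minimal sketch of the distance and outlier calculations is given below. It assumes that coordinates are already projected to metres, and it uses the inclusive quartile method (the default quantile type in R); the quartile method actually used is not stated, so the fences may differ slightly under other definitions. Function names are hypothetical.

```python
import math
import statistics


def euclidean_m(p, q):
    """Straight-line distance between two projected (x, y) points in metres."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def outlier_fences(values, k=1.5):
    """Lower and upper fences at k times the IQR beyond the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def split_outliers(values, k=1.5):
    """Separate values inside the fences from those flagged as outliers."""
    low, high = outlier_fences(values, k)
    kept = [v for v in values if low <= v <= high]
    flagged = [v for v in values if v < low or v > high]
    return kept, flagged
```

A store would then be counted as accurately geocoded when `euclidean_m(gov_point, epoi_point) <= 100`.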

All spatial analyses were conducted in ArcMap (ESRI), and all statistical analyses were conducted in R version 3.4.2 ‘Short Summer’.

Results

Table 2 shows the overall and store-level sample size, sensitivity, PPV and concordance between the commercial and government datasets. Overall agreement between the datasets was fair to moderate (Table 2). Grocery stores demonstrated the greatest agreement between datasets, followed by convenience stores and finally gas stations. This ordering held for all three diagnostic indicators employed in our analysis. Fewer than 20 % of all gas stations in both datasets were TP.

Table 2 Sensitivity, positive predictive value and concordance between a government dataset and the DMTI enhanced points of interest in Newfoundland and Labrador, Canada (n 1124)

PPV, positive-predictive value.

Estimates are provided with CI.

Table 3 shows the number of TP, FP and FN by store-type (NAICS), ownership (government dataset only) and population centre. By store type, the majority of stores were convenience stores; by population centre, the majority were rural (Table 3). Because the secondary dataset lacked ownership information, the ownership distribution of FP could not be calculated, but the bulk of both TP and FN were independently owned stores.

Table 3 True positives (TP), false positives (FP) and false negatives (FN) based on store type, ownership and population centre class when comparing a provincial dataset with the DMTI enhanced points of interest in Newfoundland and Labrador, Canada (n 1124)

Estimates are provided with CI.

The true-positive store accuracy in terms of NAICS assignments, community agreement, geospatial accuracy and overall agreement is shown in Table 1. Although the majority of true-positive stores were in rural areas (Table 3), the industrial classification and both measures of spatial accuracy were lowest in rural areas (Table 1). Just 4 % of stores in rural areas were classified accurately in terms of spatial location and NAICS assignment.

Small population centres and large population centres had similar sample sizes; both industrial classification agreement and spatial accuracy were poorer for small population centres (Table 1).

Finally, of the 380 true-positive stores, the maximum positional error between a truth and test point was a striking 372 km. The minimum error was <0·001 km, and the mean error was 17·72 km; 11 % of the sample, forty-two values, were 1·5 times the interquartile range above the third quartile (positive skew). Following outlier removal, the maximum positional error fell to 15·95 km and the mean error fell to 2·79 km. A sensitivity analysis of the buffer threshold for geocoding error revealed an approximately logarithmic curve in the percent of stores considered accurately geocoded as the buffer was relaxed from 100 m to approximately 20 km, at which point 90 % of TP stores were considered accurate.
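The buffer sensitivity analysis amounts to recomputing the share of stores within a distance threshold as that threshold is relaxed. The sketch below shows the idea; the function name, error values and buffer sizes used to exercise it are hypothetical and are not the study data.

```python
def share_within_buffer(errors_m, buffers_m):
    """Proportion of positional errors at or below each buffer distance."""
    n = len(errors_m)
    return {b: sum(e <= b for e in errors_m) / n for b in buffers_m}
```

Plotting the returned proportions against the buffer distances would trace the kind of curve described above.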

Discussion

This investigation of the validity of a CAB dataset using a government ground-truthed dataset has demonstrated variable levels of accuracy across population centres and industrial classifications. Notably, we observed grocery stores were captured in the CAB dataset with the highest accuracy among store types tested. Stores in large population centres had the greatest levels of spatial and NAICS code agreement across NL.

CAB data are an obvious entry point to food environment research, largely because of their relative ease of access. However, our work, which confirms and expands on existing literature in the rural Canadian context, suggests that CAB data are prone to error, with potential differential consequences for study outcomes depending on the environment-diet hypothesis under study^{(Reference Lebel, Daepp and Block10,Reference Fleischhacker, Evenson and Sharkey11)}. For instance, research focused on absolute access to major grocery stores in urban centres may find CAB data viable for its purpose. Yet research in rural settings, which may prioritise access to the nearest available grocery outlet (often small general stores, convenience stores or gas stations), may suffer from associated bias, leading to inconclusive or inexact findings regarding the health impacts of store access or lack thereof. This may be one of the myriad reasons that community food environment predictors of diet remain ambiguous^{(Reference McKinnon, Reedy and Morrissette7,Reference Lytle and Sokol9,Reference Minaker, Shuh and Olstad23)}.

In comparison with other Canadian provinces, NL has a relatively high proportion of stores in rural areas and a high proportion of convenience stores⁽⁴²⁾. Despite this, NAICS classification and geocoding accuracy were still higher in both small and large population centres than in rural areas. Although few regional assessments of food environments have been conducted at a province-wide scale, several potential explanations may be at play. First, large population centres may have more stable business markets, reducing business turnover and minimising the error associated with openings and closings. Second, if the CAB dataset geocoding strategy is based on street address and/or address proxies such as postal codes, the spatial algorithms that assign business locations may suffer in sparsely connected rural regions^{(Reference Khan, Pinault and Tjepkema43)} or differ by commercial vendor^{(Reference Whitsel, Rose and Wood44)}. Indeed, a potentially critical difference between CAB datasets and government datasets may be the address files employed to geocode business locations. Measures can be taken to improve geocoding, but these measures are only effective if the underlying business information and geocoding address files are contemporaneously accurate^{(Reference McDonald, Schwind and Goldberg45,Reference Faure, Danjou and Clavel-Chapelon46)}. Third, if our observation that grocery stores are captured by the CAB dataset with the highest levels of accuracy is related to random and not systematic error, the decreased accuracy in rural areas may reflect a relative dearth of grocery stores in the jurisdiction.

Varying degrees of CAB data accuracy by store-type and location have significant implications for research design and data sourcing considerations. Food environment researchers in rural settings can pursue complementary strategies for data collection and triangulation^{(Reference Sharkey15,Reference Caspi and Friebur47–Reference Fleischhacker, Evenson and Sharkey49)}, and government data may provide a better point of departure^{(Reference Daepp and Black14)}. Further, although CAB data may be useful in urban settings, our data suggest that smaller population centres may still suffer from bias; the urban utility of CAB data may only extend to grocery stores in large population centres.

Limitations

Because sample size can affect rates of diagnostic error, stratification may have affected the diagnostic outcomes^{(Reference Lebel, Daepp and Block10)}. Additionally, the government dataset we used as a gold-standard is secondary administrative data; frequency of site visits is tailored to level of food safety risk and, in our jurisdiction, food inspection data are ideally reviewed and collected in 3-year cycles^{(Reference Mah, Pomeroy and Knox37)}. The collection and ground-truthing of these data are constrained by associated limitations in public health service budgets. To mitigate this issue, we partnered with government throughout the research to support data access and accuracy.

Conclusion

The current research is the first province-wide data validation analysis in Atlantic Canada and is potentially generalisable to other heterogeneous sub-national jurisdictions (regions, states, territories). The use of geographic scales that align with government administrative regions has significance for policymaking. Our findings suggest that CAB data are less accurate in rural regions and may identify and classify grocery stores with higher accuracy than convenience stores and gas stations. Researchers should evaluate multiple data collection strategies at the community level and partner with local institutions when possible. It is crucial to recognise that dataset errors may in fact be a function of policy-relevant service considerations. The current research emphasises the impact of systematic error in CAB data for researchers working in rural sampling frames or implementing an analytic strategy encompassing retailers not classified as grocers.

Acknowledgements

Acknowledgements: The Dalhousie GIS Centre provided access to the commercial data. Thanks to the Centre of Geographic Sciences for support throughout the project. The NL Statistics Agency and Service NL assisted with access to and coding of administrative data.

Financial support: This work was supported in part by Health Canada’s Office of Nutrition Policy and Promotion (MOA #4500327812 to C.L.M.); the Canadian Institutes of Health Research (FRN PG1-144782 to C.L.M.); the Canada Research Chairs program (to C.L.M.) and the Nova Scotia Graduate Scholarship program (to N.G.A.T.).

Conflict of interest: None to declare.

Authorship: N.G.A.T. co-led the data processing, analysis and interpretation and led manuscript preparation; J.S. led data processing and analysis; C.L.M. formulated the research question and study design, led implementation and interpretation, co-led manuscript preparation and supervised all components of the project.

Ethics of human subject participation: This research did not involve human participants. This project used institutionally and commercially available retailer data, and this work did not require the approval of a research ethics board.

References

Soriano, JB, Abajobir, AA, Abate, KH et al. (2017) Global, regional, and national deaths, prevalence, disability-adjusted life years, and years lived with disability for chronic obstructive pulmonary disease and asthma, 1990–2015: a systematic analysis for the Global Burden of Disease Study 2015. Lancet Respir Med 5, 691–706.

Afshin, A, John Sur, P, Fay, KA et al. (2019) Health effects of dietary risks in 195 countries, 1990–2017: a systematic analysis for the Global Burden of Disease Study 2017. Lancet 393, 1958–1972. doi: 10.1016/S0140-6736(19)30041-8.

Kleinert, S & Horton, R (2015) Rethinking and reframing obesity. Lancet 385, 2326–2328. doi: 10.1016/S0140-6736(15)60163-5.

Giskes, K, van Lenthe, F, Avendano-Pabon, M et al. (2011) A systematic review of environmental factors and obesogenic dietary intakes among adults: are we getting closer to understanding obesogenic environments? Obes Rev 12, e95–e106.

Peeters, A (2018) Obesity and the future of food policies that promote healthy diets. Nat Rev Endocrinol 14, 430–437.

Glanz, K, Sallis, JF, Saelens, BE et al. (2005) Healthy Nutrition Environments: Concepts and Measures. https://journals.sagepub.com/doi/pdf/10.4278/0890-1171-19.5.330 (accessed May 2019).

McKinnon, RA, Reedy, J, Morrissette, MA et al. (2009) Measures of the food environment. A compilation of the literature, 1990–2007. Am J Prev Med 36, S124–S133. doi: 10.1016/j.amepre.2009.01.012.

Lytle, LA (2009) Measuring the food environment. State of the science. Am J Prev Med 36, S134–S144.

Lytle, LA & Sokol, RL (2017) Measures of the food environment: a systematic review of the field, 2007–2015. Health Place 44, 18–34.

Lebel, A, Daepp, MIG, Block, JP et al. (2017) Quantifying the foodscape: a systematic review and meta-analysis of the validity of commercially available business data. PLoS One 12, e0174417. doi: 10.1371/journal.pone.0174417.

Fleischhacker, SE, Evenson, KR, Sharkey, J et al. (2013) Validity of secondary retail food outlet data: a systematic review. Am J Prev Med 45, 462–473.

Trevethan, R (2017) Sensitivity, specificity, and predictive values: foundations, pliabilities, and pitfalls in research and practice. Front Public Heal 5, 307.

Clary, CM & Kestens, Y (2013) Field validation of secondary data sources: a novel measure of representativity applied to a Canadian food outlet database. Int J Behav Nutr Phys Act 10, 77. doi: 10.1186/1479-5868-10-77.

Daepp, MI & Black, J (2017) Assessing the validity of commercial and municipal food environment data sets in Vancouver, Canada. Public Health Nutr 20, 2649–2659.

Sharkey, JR (2009) Measuring potential access to food stores and food-service places in rural areas in the U.S. Am J Prev Med 36, S151–S155.

Sharkey, JR & Horel, S (2008) Neighborhood socioeconomic deprivation and minority composition are associated with better potential spatial access to the ground-truthed food environment in a large rural area. J Nutr 138, 620–627.

Liese, AD, Colabianchi, N, Lamichhane, AP et al. (2010) Validation of 3 food outlet databases: completeness and geospatial accuracy in rural and urban food environments. Am J Epidemiol 172, 1324–1333.

Longacre, MR, Primack, BA, Owens, PM et al. (2011) Public directory data sources do not accurately characterize the food environment in two predominantly rural states. J Am Diet Assoc 111, 577–582.

McGuirt, JT, Jilcott, SB, Vu, MB et al. (2011) Conducting community audits to evaluate community resources for healthful lifestyle behaviors: an illustration from rural eastern North Carolina. Prev Chronic Dis 8, A149.

Lake, AA, Burgoine, T, Stamp, E et al. (2012) The foodscape: classification and field validation of secondary data sources across urban/rural and socio-economic classifications in England. Int J Behav Nutr Phys Act 9, 37.

Burgoine, T & Harrison, F (2013) Comparing the accuracy of two secondary food environment data sources in the UK across socio-economic and urban/rural divides. Int J Health Geogr 12, 2. doi: 10.1186/1476-072X-12-2.

Wilkins, EL, Radley, D, Morris, MA et al. (2017) Examining the validity and utility of two secondary sources of food environment data against street audits in England. Nutr J 16, 82. doi: 10.1186/s12937-017-0302-1.

Minaker, LM, Shuh, A, Olstad, DL et al. (2016) Retail food environments research in Canada: a scoping review. Can J Public Heal 107, eS4–eS13.

Bruner, MW, Lawson, J, Pickett, W et al. (2008) Rural Canadian adolescents are more likely to be obese compared with urban adolescents. Int J Pediatr Obes 3, 205–211.

Pong, RW, DesMeules, M & Lagacé, C (2009) Rural–urban disparities in health: how does Canada fare and how does Canada compare with Australia? Aust J Rural Health 17, 58–64.

Penney, TL, Rainham, DGC, Dummer, TJB et al. (2014) A spatial analysis of community level overweight and obesity. J Hum Nutr Diet 27, 65–74.

Shearer, C, Blanchard, C, Kirk, S et al. (2012) Physical activity and nutrition among youth in rural, suburban and urban neighbourhood types. Can J Public Heal 103, S55–S60.

Canadian Cancer Statistics 2017 (Canadian Cancer Society’s Advisory Committee on Cancer Statistics), Toronto, Ontario. http://www.cancer.ca/~/media/cancer.ca/CW/cancerinformation/cancer101/Canadiancancerstatistics/Canadian-Cancer-Statistics-2017-EN.pdf (accessed April 2019).

Table: 13-10-0113-01 (formerly CANSIM 105-0509), Canadian Health Characteristics, Two Year Period Estimates, by Age Group and Sex, Canada, Provinces, Territories and Health Regions (Source: Canadian Community Health Survey). Statistics Canada. https://www150.statcan.gc.ca/t1/tbl1/en/tv.action?pid=1310011301 (accessed April 2019).

Table: 13-10-0794-01 (formerly CANSIM 105-2023), Measured Adult Body Mass Index (BMI) (World Health Organization Classification), by Age Group and Sex, Canada and Provinces (Source: Canadian Community Health Survey (CCHS) – Nutrition, 2004 and 2015). Statistics Canada. https://www150.statcan.gc.ca/t1/tbl1/en/tv.action?pid=1310079401 (accessed April 2019).

Garriguet, D (2008) Beverage consumption of children and teens. Heal Rep 19, 17–22.

Garriguet, D (2007) Canadians’ eating habits. Heal Rep 18, 17–32.

Black, JL & Billette, J-M (2013) Do Canadians meet Canada’s Food Guide’s recommendations for fruits and vegetables? Appl Physiol Nutr Metab 38, 234–242.

Table: 13-10-0096-01 (formerly CANSIM 105-0508), Canadian Health Characteristics, Annual Estimates, by Age Group and Sex, Canada (Excluding Territories) and Provinces (Source: Canadian Community Health Survey). Statistics Canada. https://www150.statcan.gc.ca/t1/tbl1/en/tv.action?pid=1310009601 (accessed April 2019).

Canada’s Health and Nutrition Atlas (Website, Archived). Statistics Canada. https://www.canada.ca/en/health-canada/services/food-nutrition/food-nutrition-surveillance/canada-nutrition-atlas/maps-indicator.html#nu (accessed April 2019).

Population Centre and Rural Area Classification (2016) https://www.statcan.gc.ca/eng/subjects/standard/pcrac/2016/introduction (accessed August 2018).

Mah, CL, Pomeroy, S, Knox, B et al. (2018) An assessment of the rural consumer food environment in Newfoundland and Labrador, Canada. J Hunger Environ Nutr 14, 490–510. https://doi.org/10.1080/19320248.2018.1465000.

Cummins, S, Smith, DM, Aitken, Z et al. (2010) Neighbourhood deprivation and the price and availability of fruit and vegetables in Scotland. J Hum Nutr Diet 23, 494–501.

Wrigley, N, Coe, NM & Currah, A (2005) Globalizing retail: conceptualizing the distribution-based transnational corporation (TNC). Prog Hum Geogr 29, 437–457.

Landis, JR & Koch, GG (1977) The measurement of observer agreement for categorical data. Biometrics 33, 159.

Ma, X, Battersby, SE, Bell, BA et al. (2013) Variation in low food access areas due to data source inaccuracies. Appl Geogr 45, 131–137. doi: 10.1016/j.apgeog.2013.08.014.

Canada’s Convenience and Fuel Retail Channel Annual Facts & Figures Report (Canadian Convenience Stores Association) Oakville, 2017. https://depquebec.net/wp-content/uploads/2017/09/CCSA-2017-Annual-Facts-Figures-Report.pdf (accessed March 2020).

Khan, S, Pinault, L, Tjepkema, M et al. (2018) Health Reports Positional accuracy of geocoding from residential postal codes versus full street addresses. www.statcan.gc.ca (accessed August 2019).

Whitsel, EA, Rose, KM, Wood, JL et al. (2004) Accuracy and repeatability of commercial geocoding. Am J Epidemiol 160, 1023–1029.

McDonald, YJ, Schwind, M, Goldberg, DW et al. (2017) An analysis of the process and results of manual geocode correction. Geospat Health 12, 526.

Faure, E, Danjou, AMN, Clavel-Chapelon, F et al. (2017) Accuracy of two geocoding methods for geographic information system-based exposure assessment in epidemiological studies. Environ Heal 16, 15.

Caspi, CE & Friebur, R (2016) Modified ground-truthing: an accurate and cost-effective food environment validation method for town and rural areas. Int J Behav Nutr Phys Act 13, 37. doi: 10.1186/s12966-016-0360-3.

Jones, KK, Zenk, SN, Tarlov, E et al. (2017) A step-by-step approach to improve data quality when using commercial business lists to characterize retail food environments. BMC Res Notes 10, 35.

Fleischhacker, SE, Evenson, KR, Sharkey, J et al. (2013) Validity of secondary retail food outlet data: a systematic review. Am J Prev Med 45, 462–473.
