
Handling missing data in an FFQ: multiple imputation and nutrient intake estimates

  • Mari Ichikawa (a1), Akihiro Hosono (a1), Yuya Tamai (a1), Miki Watanabe (a1), Kiyoshi Shibata (a1) (a2), Shoko Tsujimura (a1), Kyoko Oka (a1), Hitomi Fujita (a1), Naoko Okamoto (a1), Mayumi Kamiya (a1), Fumi Kondo (a1), Ryozo Wakabayashi (a1), Taiji Noguchi (a1), Tatsuya Isomura (a1) (a3) (a4), Nahomi Imaeda (a1) (a5), Chiho Goto (a1) (a6), Tamaki Yamada (a7) and Sadao Suzuki (a1)...



We aimed to examine missing data in an FFQ and to assess how the handling of missing values affects dietary intake estimates by comparing multiple imputation with zero imputation.


We used data from the Okazaki Japan Multi-Institutional Collaborative Cohort (J-MICC) study. A self-administered questionnaire including an FFQ was administered at baseline (FFQ1) and at 5-year follow-up (FFQ2). Missing values in FFQ2 were replaced in three ways: by the corresponding FFQ1 values, by multiple imputation and by zero imputation.
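The contrast between zero imputation and multiple imputation can be illustrated with a minimal Python sketch. It is not the procedure used in the study: a simple hot-deck draw from the observed answers stands in for a full imputation model, and the function names, the number of imputations and the example data are all hypothetical. Only the point estimate is pooled, by averaging across the imputed data sets.

```python
import random
import statistics


def zero_impute(values):
    """Replace every missing answer (None) with zero intake."""
    return [0.0 if v is None else v for v in values]


def hot_deck_impute(values, rng):
    """Replace each missing answer with a random draw from the observed answers.

    Assumption: a simplified stand-in for a model-based imputation step.
    """
    observed = [v for v in values if v is not None]
    return [rng.choice(observed) if v is None else v for v in values]


def pooled_mean(values, m=20, seed=0):
    """Create m imputed data sets and average their means (pooled point estimate)."""
    rng = random.Random(seed)
    estimates = [statistics.fmean(hot_deck_impute(values, rng)) for _ in range(m)]
    return statistics.fmean(estimates)


# Hypothetical weekly intake frequencies for one food item; None marks a missing answer.
item = [0, 0, 1, 2, None, 3, None, 1, 0, 2]

zero_mean = statistics.fmean(zero_impute(item))
mi_mean = pooled_mean(item)
```

In this toy example zero imputation can only pull the mean down, whereas the pooled estimate stays close to the mean of the observed answers. Which behaviour is preferable depends on how often a missing answer truly reflects zero intake.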


A methodological sub-study of the Okazaki J-MICC study.


Of a total of 7585 men and women aged 35–79 years at baseline, we analysed data for 5120 participants who answered all items in FFQ1 and at least 50% of items in FFQ2.
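The inclusion rule stated above can be sketched as a small filter. This is an illustration under assumptions, not the study's code: answers are represented as lists in which `None` marks a missing item, and the function name and parameter are hypothetical.

```python
def is_eligible(ffq1, ffq2, min_ffq2_fraction=0.5):
    """Return True if FFQ1 is complete and at least the given fraction of FFQ2 is answered."""
    ffq1_complete = all(v is not None for v in ffq1)
    answered = sum(v is not None for v in ffq2)
    return ffq1_complete and answered >= min_ffq2_fraction * len(ffq2)


# Hypothetical participants with four food items each.
complete_half = is_eligible([1, 2, 3, 4], [1, None, 2, None])
missing_baseline = is_eligible([1, None, 3, 4], [1, 2, 3, 4])
too_sparse = is_eligible([1, 2, 3, 4], [1, None, None, None])
```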


Among the 5120 participants, the proportion of missing data was 3·7%. The number of missing food items in FFQ2 varied with personal characteristics. Infrequently eaten food items that were missing in FFQ2 were likely to have been reported as zero intake in FFQ1. For most food items, the observed proportion of zero intake was similar to the probability that a missing value represented zero intake. Relative to FFQ1 values, multiple imputation yielded smaller differences in total energy and nutrient estimates than zero imputation, except for alcohol.


Our results indicate that missing values due to zero intake in an FFQ, that is, values missing not at random, can be predicted reasonably well from observed data. Multiple imputation performed better than zero imputation for most nutrients and may be applied to FFQ data when the proportion of missing data is low.






Supplementary materials

Ichikawa et al. supplementary material: Tables S1 and S2 (Word, 51 KB)

