In Korea, cancer has been among the leading causes of death since 1983(1), and cancer mortality has been increasing steadily(2). The medical cost associated with each cancer case is higher than that of other types of disease, and a significant proportion of cancer incidence is associated with dietary and lifestyle risk factors. Consequently, cancer research has become critically important for the whole nation.
Doll & Peto(Reference Doll and Peto3) estimated that changes in dietary habits alone could reduce cancer incidence by a third. The World Cancer Research Fund/American Institute for Cancer Research reported that 30–40 % of carcinogenesis could be prevented by optimal nutrition, regular physical activity and the prevention of obesity(4). In 2007, ten dietary and physical activity recommendations for cancer prevention were established(5).
Cancer and dietary risk factors have extensively been studied with various diet survey tools. A multiethnic cohort study in Hawaii and Los Angeles(Reference Kolonel, Henderson and Hankin6) used a newly developed quantitative FFQ considering traditional dishes for each ethnic group(Reference Stram, Hankin and Wilkens7). The FFQ was calibrated for each ethnic and sex group. In the Women's Health Initiative, the relationship between low-fat diets and colorectal and breast cancer(Reference Chlebowski, Pettinger and Stefanick8, 9) was investigated by a semi-quantitative FFQ including 122 food items(Reference Patterson, Kristal and Tinker10) and the minority population FFQ was used in the Women's Health Trial Feasibility Study(Reference Kristal, Shattuck and Patterson11). In the European Prospective Investigation into Cancer and Nutrition project, dietary survey data and nutrient databases of each country, in cohorts of ten European nations, were integrated into one dataset for an accurate dietary survey for the entire study. Then, FFQ appropriate for the dietary culture of each country were developed, validated and calibrated with each country's research results(Reference Deharveng, Charrondiere and Slimani12–Reference Voss, Charrondiere and Slimani16). The Japan Collaborative Cohort Study used an FFQ consisting of thirty-five food items and a questionnaire that assessed the intake of alcohol and the preference for fatty and salty foods(Reference Iso, Date and Noda17), with portion size determined by a validation study(Reference Date, Fukui and Yamamoto18).
Recently, Korean cohorts have been established to study the relationship between diet and cancer(Reference Shin, Lim and Sung19, Reference Shin, Lim and Sung20). Several small-scale surveys or laboratory studies have reported that traditional Korean diets are cancer-protective as a result of their emphasis on allium-containing vegetables such as garlic(Reference Kim, Shin and Ahn21, Reference Park, Kim and Suh22). However, Korean diets have several undesirable attributes, such as high salt intake and high consumption of barbecued meat. Furthermore, Korean cooking is characterised by similar ingredients prepared in different ways and mixed with many similar seasonings. Because cancer-related dietary factors (CRDF) depend on culture-specific cooking methods and ingredients, focusing on the consumption of prepared dishes, rather than food ingredients, was deemed appropriate for Korean diet-related cancer research. A few Korean FFQ for studying chronic diseases have been developed(Reference Ahn, Lee and Cho23, Reference Kim, Yun and Kim24), but no FFQ for cancer research is currently available. In the present study, we aimed to develop a dish-based, semi-quantitative FFQ for diet and cancer research using a database approach.
Selection of cancer-related dietary factors
Of the CRDF that have been widely accepted in the scientific community(5), sixteen factors that are extractable from the Korean National Health and Nutrition Examination Survey (KNHANES) were selected for the present study. Specifically, those factors ranged from the ‘convincing’ to the ‘limited suggestive’ categories of the World Cancer Research Fund/American Institute for Cancer Research's report(5). Several suggested factors, such as β-carotene supplements, lycopene, folate, Se and vitamin D, were not included in the present study because these data were unavailable in the Korean food composition database.
Both allium-containing vegetables (the sum of vegetables such as garlic, onion, leeks, scallions and chives) and garlic are noted separately as cancer-protective factors in the World Cancer Research Fund/American Institute for Cancer Research publication. Garlic, but not allium-containing vegetables, was included in developing the present instrument. When all dishes consumed by the subjects in the KNHANES were entered into the multiple regression for the variability analysis, the cumulative coefficient of determination (R²) for allium-containing vegetables did not reach 0·9. In other words, even if every dish eaten by the KNHANES subjects were included in the final FFQ, the dish list could not adequately explain the between-person variation in allium vegetable intake. This occurs because allium vegetables are added in very small amounts as flavourings to almost every Korean dish.
Sugar-containing foods, a dietary risk factor for colorectal cancer(5), were not included in the present study. Sugar-containing foods in the Korean diet include snacks, cookies, cakes, breads, coffee from vending machines, ice cream and yogurt, in addition to added sugar and syrup in prepared foods. However, there was no publicly released database of the sugar content of these dishes; therefore, sugar content could not be calculated. Thus, the sixteen CRDF selected for the present study were β-carotene, Ca, vitamin C, Na, retinol, red meat, alcohol, processed meats, fruits, vegetables (including garlic and carrot), garlic, carrots, milk, dairy food (including milk), beans and fish.
Determination of the dish list and portion size
Using a database approach, we determined the dish list and portion sizes from two large-scale Korean dietary intake survey datasets: the KNHANES (2001) and the Korean National Nutrition Survey by Season (2002), which are the most recent seasonal survey datasets available in Korea. Both surveys used a single 24 h recall. The purpose of the Korean National Nutrition Survey by Season was to assess variations in dietary intake in terms of dishes, foods and nutrients among the four seasons. The subjects of the two surveys overlapped by design. In the 2002 Korean National Nutrition Survey by Season, the numbers of subjects over 30 years old were 1644, 1665 and 1558 for the spring, summer and autumn surveys, respectively, with a mean of 1623 people. Therefore, 1623 people were selected randomly from the 5722 people of the same age group in the 2001 KNHANES winter survey. Those who participated in the survey in two seasons were regarded as different persons. Student's t test showed that our working dataset was representative of the full 2001 KNHANES dataset in terms of nutrient intake (data not shown).
The total number of subjects across all four seasons was 6490 (Table 1). These subjects had a mean age of 49 years, with men and women accounting for 46·1 and 53·9 %, respectively. The most frequent age group was 30–49 years of age (57·9 %), followed by 50–64 years of age (25·6 %) and the group over 65 years of age (16·5 %).
Of the 993 dishes eaten by the subjects, the final dish list was selected from the results of the contribution analysis and from the results of the variability analysis of the effects of between-person variation (Fig. 1). Both analyses were performed for the intake of CRDF and non-CRDF nutrients, i.e. energy, carbohydrate, protein, fat, etc., which were also included for nutritional assessment purposes.
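The contribution analysis named above can be illustrated with a short sketch. It ranks dishes by their share of the population's total intake of one factor and keeps dishes until a cumulative cutoff is reached. The record layout, the function names `contribution_ranking` and `select_by_contribution`, and the example numbers are assumptions made for illustration; they are not taken from the KNHANES files or from the authors' software.

```python
from collections import defaultdict


def contribution_ranking(records, factor):
    """Rank dishes by their percentage contribution to total intake of `factor`.

    `records` is a hypothetical list of dicts, one per reported eating
    occasion, e.g. {"dish": "kimchi stew", "Na": 600.0}.
    Returns a list of (dish, percent, cumulative_percent), largest first.
    """
    totals = defaultdict(float)
    for rec in records:
        totals[rec["dish"]] += rec.get(factor, 0.0)
    grand_total = sum(totals.values())
    if grand_total == 0:
        return []
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    ranking, cumulative = [], 0.0
    for dish, amount in ranked:
        percent = 100.0 * amount / grand_total
        cumulative += percent
        ranking.append((dish, percent, cumulative))
    return ranking


def select_by_contribution(records, factor, cutoff_pct):
    """Keep top-ranked dishes until their cumulative share reaches `cutoff_pct`."""
    selected = []
    for dish, _percent, cumulative in contribution_ranking(records, factor):
        selected.append(dish)
        if cumulative >= cutoff_pct:
            break
    return selected


# Made-up example data (not survey values).
example_records = [
    {"dish": "kimchi stew", "Na": 500.0},
    {"dish": "soyabean paste soup", "Na": 300.0},
    {"dish": "steamed rice", "Na": 200.0},
]
example_selection = select_by_contribution(example_records, "Na", cutoff_pct=50)
```

Raising `cutoff_pct` lengthens the dish list, which is the trade-off between coverage and questionnaire length that the selection procedure has to balance.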
First, we tried to select enough dishes to ensure that their total contribution to each CRDF and non-CRDF nutrient would exceed 90 %. Meeting this criterion required more than 400 dishes; therefore, we instead chose the dishes that collectively contributed more than 50 % to each risk factor used in the final study. Variability analyses were then performed using multiple regression analyses to select dishes that explained between-person variations in the intake of two or more CRDF. All dishes eaten by the subjects (993 dishes) were entered as independent variables into one model for each risk factor, and the independent variables were selected by a stepwise method within each model. Through these processes, the dishes that together reached a cumulative R² of over 90 % for each CRDF and non-CRDF nutrient were selected and then merged with those accounting for over 50 % of the cumulative contribution, to reach a manageable number of dishes. The resulting 181 dishes were then regrouped by similarity in main ingredients and/or serving unit, and four alcoholic beverages, fruits and yogurts were added to reach the final 112 dishes. For example, dishes such as ‘soyabean paste soup, plain’, ‘soyabean paste soup, chard’, ‘soyabean paste soup, Chinese cabbage’, ‘soyabean paste soup, spinach’, ‘soyabean paste soup, radish leaves’, ‘soyabean paste soup, mugwort’ and ‘soyabean paste soup, mallow’ were grouped into one dish item, ‘soyabean paste soup’, as presented in Table 3.
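The variability analysis can be sketched as a forward selection that adds, at each step, the dish giving the largest gain in R², and stops once the cumulative R² reaches a target. This is a simplified stand-in for the stepwise procedure described above: it has no significance-based entry or removal rule, and the function name `forward_select`, the input layout and the tolerance are assumptions for illustration only.

```python
def _center(values):
    """Subtract the mean so that no separate intercept term is needed."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def forward_select(dish_intakes, factor_intake, target_r2=0.9, tol=1e-12):
    """Forward selection of dishes by cumulative R^2.

    dish_intakes: hypothetical dict, dish -> per-person amounts of the factor
                  obtained from that dish (one value per subject).
    factor_intake: per-person total intake of the factor (dependent variable).
    Returns a list of (dish, cumulative_r2) in order of entry.
    """
    residual = _center(factor_intake)
    total_ss = _dot(residual, residual)
    if total_ss <= tol:
        return []  # no between-person variation to explain
    candidates = {dish: _center(col) for dish, col in dish_intakes.items()}
    steps, r2 = [], 0.0
    while candidates and r2 < target_r2:
        best_dish, best_gain = None, 0.0
        for dish, col in candidates.items():
            norm = _dot(col, col)
            if norm <= tol:
                continue  # constant or already explained by chosen dishes
            gain = _dot(col, residual) ** 2 / norm
            if gain > best_gain:
                best_dish, best_gain = dish, gain
        if best_dish is None or best_gain <= tol:
            break
        chosen = candidates.pop(best_dish)
        norm = _dot(chosen, chosen)
        coef = _dot(chosen, residual) / norm
        residual = [r - coef * c for r, c in zip(residual, chosen)]
        # Remove the chosen dish's component from the remaining candidates,
        # so later gains are measured conditional on dishes already entered.
        candidates = {
            dish: [x - (_dot(col, chosen) / norm) * c for x, c in zip(col, chosen)]
            for dish, col in candidates.items()
        }
        r2 = 1.0 - _dot(residual, residual) / total_ss
        steps.append((best_dish, r2))
    return steps
```

If the returned list ends with a cumulative R² below the target even after every candidate has been considered, the factor cannot be captured adequately by any dish list, which corresponds to the situation reported for allium-containing vegetables.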
Determination of frequency response formats
The frequency response formats of consumption were classified into four types: nine categories for rice, soups, stews and side dishes (i.e. never to three times/d); nine categories for beverages (i.e. never to over six times/d); eight categories for fruits (i.e. never to over four times/d)(25); and eight categories for alcoholic beverages (i.e. never to twice/d). The estimated intake of CRDF and non-CRDF nutrients was then converted to the total daily intake according to the frequency and serving size categories.
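The conversion to daily intake can be sketched as the product of three terms: the nutrient content per medium portion, a portion-size multiplier and the reported frequency expressed as times per day. The mapping `FREQ_PER_DAY` below is hypothetical; only its end points (never, three times/d) appear in the text, and the intermediate categories are placeholders. The example values for ‘soyabean paste soup’ (167·2 kJ per medium portion, multipliers 0·7 and 1·48) are those reported later in this paper.

```python
# Hypothetical frequency categories expressed as times per day.
# Only "never" and "3/d" are stated in the text; the others are assumed.
FREQ_PER_DAY = {
    "never": 0.0,
    "1/month": 1 / 30,
    "1/week": 1 / 7,
    "1/d": 1.0,
    "3/d": 3.0,
}


def daily_intake(per_medium_portion, freq_per_day, magnification=1.0):
    """Daily intake contributed by one dish item."""
    return per_medium_portion * magnification * freq_per_day


def total_daily_intake(responses, nutrient_db, magnifications):
    """Sum daily intake over all answered dish items.

    responses: dish -> (frequency label, portion label)
    nutrient_db: dish -> nutrient amount per medium portion
    magnifications: dish -> {"small": x, "medium": 1.0, "large": y}
    """
    total = 0.0
    for dish, (freq_label, size_label) in responses.items():
        total += daily_intake(
            nutrient_db[dish],
            FREQ_PER_DAY[freq_label],
            magnifications[dish][size_label],
        )
    return total


# Energy (kJ) for one large serving of soyabean paste soup per day.
energy_db = {"soyabean paste soup": 167.2}
size_factors = {"soyabean paste soup": {"small": 0.7, "medium": 1.0, "large": 1.48}}
answers = {"soyabean paste soup": ("1/d", "large")}
soup_energy_per_day = total_daily_intake(answers, energy_db, size_factors)
```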
Determination of portion sizes
The portion sizes in the present study were determined based on the reported amounts consumed in the KNHANES data. The small, medium and large portion sizes for each dish represented the amounts (g) consumed by the KNHANES subjects at the 25th, 50th and 75th percentiles, respectively. A few exceptions were dishes with a strong central tendency, i.e. with little or no variation in serving size among the population. Examples are steamed rice, rice with mixed grains, fruit juices and fried eggs, for which the small, medium and large portion sizes corresponded to the 10th, 50th and 90th percentiles, respectively. In actual surveys, photographs of the representative dishes were presented on the questionnaire to differentiate the three portion sizes.
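The percentile rule can be sketched as follows. The text does not state which percentile definition was used, so the linear interpolation between ranked observations below is an assumption, as are the function names and the example amounts.

```python
import math


def percentile(values, p):
    """p-th percentile (0-100) using linear interpolation between ranks."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("no portion data")
    rank = (len(ordered) - 1) * p / 100.0
    lower = math.floor(rank)
    upper = math.ceil(rank)
    fraction = rank - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


def portion_sizes(amounts_g, low=25, high=75):
    """Small, medium and large portions (g) from reported amounts.

    Use low=10, high=90 for dishes whose serving size hardly varies,
    such as steamed rice.
    """
    return {
        "small": percentile(amounts_g, low),
        "medium": percentile(amounts_g, 50),
        "large": percentile(amounts_g, high),
    }


# Made-up reported amounts (g) for one dish.
reported = [40, 50, 60, 70, 80]
default_sizes = portion_sizes(reported)
wide_sizes = portion_sizes(reported, low=10, high=90)
```

Widening the cut-offs to the 10th and 90th percentiles spreads the three options further apart, which is useful when the 25th and 75th percentiles would be almost identical to the median.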
Development of a nutrient database of the 112 dish items
For each of the 112 dish items on the questionnaire, a nutrient database (per medium portion) was created from the analytical data from the 2001 KNHANES and the 2002 Korean National Nutrition Survey by Season, the recipes of each dish and the frequency weights. The final 112 dish items were made from 3260 ingredients. The ingredients appearing at a frequency of less than 1 % were removed from each dish after an examination of the recipes. Consequently, the number of ingredients found in the 112 dish items was reduced to 1769. The nutrient composition of each dish item was calculated according to the median value of the nutrients of each dish. Groups of dishes were assigned weights of intake frequency from the analysis data. The nutrient database for alcoholic beverages was added after calculating the alcohol content using the nutrient database from the appendix of the seventh Korean Recommended Dietary Allowance(26), together with the measured weight of one unit of each type of alcoholic beverage as purchased from markets.
The rankings of dishes with respect to the CRDF in the contribution analysis and the variability analysis are summarised in Table 2. The most frequently selected dish in the contribution analysis was ‘kimchi stew’, which contributed significantly to the intake of a total of eleven CRDF: red meat, processed meat, vegetables, garlic, beans, fish, Na, β-carotene, Ca, vitamin C and retinol. The next dish was ‘soyabean paste soup’, which contributed to a total of ten CRDF. The most frequently selected dish in the variability analysis was ‘soyabean paste stew’, which contributed significantly to the intake of a total of eight CRDF: vegetables, garlic, beans, fish, Na, Ca, β-carotene and vitamin C. The next dish items were ‘rolled rice’, ‘soyabean paste soup with radish leaves’ and ‘kimchi stew’, which contributed to a total of six CRDF. In these rankings, seven different types of kimchi and five different types of soyabean paste soup were included, indicating that these two items are important contributors for both CRDF and non-CRDF nutrients in the Korean diet.
* Total number of CRDF is sixteen.
These 112 dish items consisted of nine types of staple foods, including rice and noodles, twenty-five types of soups and stews, fifty-four types of side dishes, nine types of beverages, nine types of fruits and six types of alcoholic beverages. A database of dish items was constructed, taking into account the median serving size of the actual distribution and the frequency weight of each dish. An example of the frequency weights for ‘soyabean paste soup’ is shown in Table 3. Each dish item was assigned a weighted mean value, obtained by multiplying the nutrient and food content of each constituent dish by its intake frequency weight. The energy content per medium portion (61 g) was lowest for ‘soyabean paste soup, Chinese cabbage’ (129·6 kJ) and highest for ‘soyabean paste soup, plain’ (183·9 kJ). However, the frequency weight of ‘soyabean paste soup, plain’ (0·30) was higher than that of ‘soyabean paste soup, Chinese cabbage’ (0·16). Therefore, the weighted mean energy of the dish item ‘soyabean paste soup’ was calculated as 167·2 kJ/61 g, which was close to the energy of ‘soyabean paste soup, plain’. The weights of vegetables and garlic varied across these dishes, ranging from 6·2 to 71·6 g and from 0 to 1·2 g, respectively; their weighted means were calculated as 36·7 and 0·6 g, respectively. The protein content of ‘soyabean paste soup’ was also calculated using the frequency weights. The magnification values for the portion sizes of ‘soyabean paste soup’ were 0·7 for ‘small’ and 1·48 for ‘large’ relative to the median weight of ‘soyabean paste soup’ (the magnification value of the medium portion size is always 1). Across all dishes, the magnification values ranged from 0·3 to 0·98 for the ‘small’ portion and from 1·1 to 6 for the ‘large’ portion.
The magnification values were greater than 2 for the ‘large’ portion of ‘chicken gruel’, ‘steamed seafood with sprout/fried seafood’, ‘roasted duck’ and others, and less than 0·5 for the ‘small’ portion of ‘fried pepper’, ‘flatfish sushi’, ‘kale’ and others (data not shown).
* Nutrient database was calculated per medium portion size of dish.
† Small, medium and large portion sizes represent 25th, 50th and 75th percentiles, respectively, of the weighted portion sizes consumed by the subjects.
‡ Nutrient composition of the dish item = Σ(frequency weight × nutrient composition)/number of variations.
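The frequency-weighted composition of a grouped dish item can be sketched as a weighted mean over its variants. The sketch normalises by the sum of the frequency weights, which is an assumption made here so that the result stays on the scale of a single portion; the footnote above states the calculation in terms of the number of variations. The function name and the example numbers are illustrative and are not the values in Table 3.

```python
def weighted_composition(variants):
    """Frequency-weighted mean of a nutrient across the variants of one dish item.

    variants: list of (frequency_weight, nutrient_per_medium_portion).
    Assumption: the weighted sum is divided by the sum of the weights.
    """
    total_weight = sum(weight for weight, _ in variants)
    if total_weight <= 0:
        raise ValueError("frequency weights must sum to a positive value")
    return sum(weight * value for weight, value in variants) / total_weight


# Made-up variants of one dish item: (frequency weight, energy in kJ).
example_variants = [(0.5, 100.0), (0.25, 200.0), (0.25, 300.0)]
example_energy = weighted_composition(example_variants)
```

Because the most frequently eaten variant carries the largest weight, the grouped value is pulled towards it, which is the behaviour described for ‘soyabean paste soup, plain’.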
In Table 4, the number of dishes selected for each CRDF and non-CRDF nutrient and the values of the percentage coverage of CRDF and non-CRDF nutrient intake according to the dish items in the FFQ are presented. The number of dishes selected ranged from 1 (milk) to 103 (Ca) in CRDF and from 48 (vitamin A) to 203 (P) in non-CRDF nutrients. The percentage coverage for energy was 82·4 %, whereas the values for protein, fat and carbohydrates were 76·4, 68·9 and 86·0 %, respectively. In minerals, the percentage coverage of Fe was the lowest at 73·9 %. In vitamins, retinol was the lowest at 67·2 %, and vitamin C was the highest at 84·1 %. In foods included in the CRDF, alcoholic beverages were the highest at 99·8 % and fish was the lowest at 56·7 %.
* The number of total dishes in the FFQ is not the sum of dishes from the CA and VA because of overlap. Four alcoholic beverages and two yogurts, not selected from the CA or VA, were added for alcohol and dairy food.
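The percentage coverage reported in Table 4 can be read as the share of the population's total intake of a factor that comes from dishes on the FFQ list. A minimal sketch under that reading follows; the function name, the input layout and the numbers are assumptions for illustration, not survey values.

```python
def percentage_coverage(intake_by_dish, ffq_dishes):
    """Share (%) of total intake of one factor supplied by dishes on the FFQ.

    intake_by_dish: dish -> population total intake of the factor from that dish.
    ffq_dishes: set of dish names retained on the questionnaire.
    """
    total = sum(intake_by_dish.values())
    if total == 0:
        return 0.0
    covered = sum(
        amount for dish, amount in intake_by_dish.items() if dish in ffq_dishes
    )
    return 100.0 * covered / total


# Made-up totals for one factor across all dishes eaten.
example_intake = {"dish A": 60.0, "dish B": 30.0, "dish C": 10.0}
example_coverage = percentage_coverage(example_intake, {"dish A", "dish B"})
```

A factor supplied almost entirely by one dish, as with plain milk, reaches high coverage with a single item, whereas a factor spread thinly over many dishes needs a long list to reach the same level.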
Development of this dish-based, semi-quantitative FFQ to evaluate Korean intake of CRDF and non-CRDF nutrients is timely and significant. Using a database approach, the dish items of this newly developed FFQ were identified from the most current report of the World Cancer Research Fund/American Institute for Cancer Research (2007) and the large-scale, nationally representative KNHANES dataset. This approach is comparable to the development of the Block 2005 FFQ, which used dietary data from the National Health and Nutrition Examination Survey 1999–2002(Reference Block27), and of the Diet History Questionnaire of the National Cancer Institute, which used the nationally representative data of the Continuing Survey of Food Intakes by Individuals(Reference Subar28). These instruments are representative FFQ that were formulated using a database approach. Block et al. (Reference Block, Hartman and Dresser29) reported that a database approach improves an FFQ because it takes advantage of large-scale national survey data from a representative population. The present study is the first attempt to develop a Korean-diet FFQ for cancer research using a database approach based on national dietary survey data. The resulting dietary assessment instrument could be used to facilitate diet-related cancer cohort studies in Korea.
A nutrient composition database for each dish item was made using its recipe from the KNHANES rather than using standard recipes, which do not exist for all dishes selected. The KNHANES data were considered suitable for recipes because the data were from a large population that represented the population of Korea. In many studies in Korea(Reference Ji, Kim and Choi30–Reference Yim, Lee and Park34), nutrient databases were formulated using various recipes instead of data pertaining to the particular food or dish. The Block Adult Questionnaire and the National Cancer Institute Diet History Questionnaire also use databases made up of the mean or median values calculated from the analysed data in the creation of a dish and food list(Reference McNutt, Zimmerman and Hull35).
The three portion sizes in our FFQ were determined from the rank orders of serving sizes reported by the Korean population in the KNHANES: small, medium and large, representing the 25th, 50th and 75th percentiles, respectively. This approach mirrors the dietary practices of our target population, in contrast to the commonly used arbitrary estimation of 0·5 and 1·5 times the medium size being the small and large serving sizes, respectively(Reference Won and Kim33, Reference Yim, Lee and Park34, Reference Kim, Suh and Nam36). Similar approaches were taken in determining the portion sizes of the Block Adult Questionnaire and the National Cancer Institute Diet History Questionnaire by utilising percentile cut-offs(Reference McNutt, Zimmerman and Hull35).
The number of dishes selected by the analyses for each CRDF and non-CRDF nutrient, and the percentage coverage of CRDF and non-CRDF nutrients by the 112 dish items in the FFQ, varied widely (Table 4). Among the CRDF, only one dish (plain milk) was selected for milk by both the contribution analysis and the variability analysis, whereas for Ca, eighteen dishes were selected by the contribution analysis and 102 dishes by the variability analysis, giving a total of 103 dishes. In spite of that, the percentage coverage of milk was 95·6 %, whereas that of Ca was 78·8 %. This means that plain milk could explain milk intake on its own, but 103 dishes could not fully explain Ca intake. The percentage coverage of alcoholic beverages, fruits, milk and dairy food exceeded 90 %, whereas that of fish was less than 60 %, partly because the selected fish dishes (n 79) could not represent the diverse fish dishes consumed by Koreans in the KNHANES (n 256). The percentage coverage of the six alcoholic beverages in our list approached 100 % because four more frequently consumed alcoholic beverages were added, even though only two dishes were selected in each analysis for alcohol. Among cancer-related vitamins, the percentage coverages of retinol, β-carotene and vitamin C were 67·2, 82·6 and 84·1 %, respectively. The low coverage of retinol is probably due to the incomplete nature of the KNHANES nutrient database for retinol: nine dishes were selected for retinol because that database has many missing values. Among macronutrients, the percentage coverage was highest for carbohydrates (86·0 %) and lowest for fat (68·9 %). Kim et al. (Reference Kim, Kim and Ahn37) also reported that the percentage coverage of fat was the lowest among the macronutrients in their study of the Korean diet. In that study, the coverage of Ca was 80 %, which is similar to the present result, and that of Na was 61·6 %, which is much lower than the present result.
In the study of Block et al. (Reference Block, Hartman and Dresser29), the percentage coverage of the FFQ exceeded 90 % for all nutrients compared with the dietary survey data of the National Health and Nutrition Examination Survey. The differences in the percentage coverage of macronutrients may be explained by differences in the FFQ and also by differences in the study populations, i.e. Koreans in the study of Kim et al. (Reference Kim, Kim and Ahn37) v. Americans in the study of Block et al. (Reference Block, Hartman and Dresser29). The percentage coverage of Fe in these three studies was low, possibly because of a large between-person variation as reported in the study of Bingham(Reference Bingham38). In the study of Shahar et al. (Reference Shahar, Fraser and Shai39) in Israel, the number of foods explaining 80 % of the between-person variability based on the regression model (the variability analysis in the present study) was under forty for every selected nutrient. In the present study, the dishes selected in the variability analysis explain 90 % of the between-person variability, but for some nutrients over 100 dishes were needed to explain 80 % of it. In another study, Shahar et al. (Reference Shahar, Shai and Vardi40) compared the number of foods explaining 90 % of the between-person variability among people who were born in different countries. The number of foods selected for each nutrient differed considerably among these groups. This means that the number of foods or dishes explaining the between-person variability differs among populations with different dietary cultures, and that more dishes are needed to explain sufficient between-person variability in the Korean population.
In general, if a small number of dishes represent diverse dishes that contribute to the selected CRDF, the percentage coverage is low. Conversely, if a small number of Korean dishes contribute to a selected dietary factor (such as plain milk for milk and dairy products), and those dishes are selected for the final dish list, the percentage coverage becomes high. This underscores the importance of the selection and exclusion processes for the final dish list in developing a dish-based FFQ. Equally important is the development of culture-specific FFQ from representative dietary survey data that reflect new trends in dietary lifestyles(Reference Cade, Thompson and Burley41).
The present study used the 2001–2 KNHANES dataset, which represents the sole and latest collection of data with seasonal dietary intake in Korea. The dataset had only 1 d of dietary intake data, which was extrapolated to estimate the usual intake. However, future surveys are planned to include 2 d of dietary intake data. As a result of the present study, this newly developed dish-based, semi-quantitative FFQ that calculates the intake of CRDF in the Korean diet is available for future cancer and nutrition research.
The present work was supported by the National Cancer Center in the Republic of Korea (grant no. 0720660). We thank Jae Eun Shim, PhD, for helpful suggestions during the development of the FFQ and the revision of the manuscript. M. K. P. designed the study, performed the statistical analysis, interpreted the data and wrote the manuscript. D. W. K. designed the FFQ checklist and performed the statistical analysis. J. K., S. P. and H. J. designed the study, contributed to the statistical analysis and data interpretation, and revised the manuscript. W. O. S. contributed to the data interpretation and revised the manuscript. H. Y. P. conceived of the study, participated in its design and coordination, and revised the manuscript critically for important intellectual content. All authors read and approved the final version of the manuscript. All authors declare no personal or financial interest in the content of the paper.