Prediction of fruit and vegetable intake from biomarkers using individual participant data of diet-controlled intervention studies

Abstract

Fruit and vegetable consumption produces changes in several biomarkers in blood. The present study aimed to examine the dose–response curve between fruit and vegetable consumption and carotenoid (α-carotene, β-carotene, β-cryptoxanthin, lycopene, lutein and zeaxanthin), folate and vitamin C concentrations. Furthermore, a prediction model of fruit and vegetable intake based on these biomarkers and subject characteristics (i.e. age, sex, BMI and smoking status) was established. Data from twelve diet-controlled intervention studies were obtained to develop a prediction model for fruit and vegetable intake (including and excluding fruit and vegetable juices). The study population in the present individual participant data meta-analysis consisted of 526 men and women. Carotenoid, folate and vitamin C concentrations showed a positive relationship with fruit and vegetable intake. Measures of performance for the prediction model were calculated using cross-validation. For the prediction model of fruit, vegetable and juice intake, the root mean squared error (RMSE) was 258·0 g, the correlation between observed and predicted intake was 0·78 and the mean difference between observed and predicted intake was − 1·7 g (limits of agreement: − 466·3, 462·8 g). For the prediction of fruit and vegetable intake (excluding juices), the RMSE was 201·1 g, the correlation was 0·65 and the mean bias was 2·4 g (limits of agreement: − 368·2, 373·0 g). The prediction models which include the biomarkers and subject characteristics may be used to estimate average intake at the group level and to investigate the ranking of individuals with regard to their intake of fruit and vegetables when validating questionnaires that measure intake.

A high consumption of fruit and vegetables has been associated with a reduced risk of several chronic diseases, including cancer and CVD(1–3). Therefore, intervention studies that aim to increase the consumption of fruit and vegetables using advice or counselling are often conducted. To investigate the success of an intervention, the subjects are asked to report or recall their consumption of fruits and vegetables. However, because it is highly likely that the subject is aware of the intervention (i.e. the advice or counselling), the report or recall is likely to be biased. Objective measures, such as measuring subjects' serum/plasma concentrations of carotenoids, have been used to investigate whether an intervention led to an increase in fruit and vegetable consumption as compared to the control group(4–6), but these biomarkers do not quantify the increase in fruit and vegetable intake caused by the intervention.

The validation of fruit and vegetable intake currently relies on self-reporting instruments. However, self-reported dietary intake instruments are found to be biased and to have correlated errors in comparison to recovery biomarkers, such as doubly labelled water and urinary N excretion(7–10). Therefore, if we were able to quantify fruit and vegetable intake based on biomarkers rather than on self-reporting, the comparison of self-reported intake with this biomarker-based intake estimate would provide us with a better idea of true validity. No recovery biomarker is available for fruit and vegetable intake. Therefore, it would be useful to find a predictive biomarker that can be related to the true intake of fruits and vegetables(11, 12).

It is not accurate to relate, for instance, an increase in β-carotene concentration to an exact increase in fruit and vegetable consumption. Single biomarkers and the sum of carotenoids have previously been shown to have low correlations with self-reported intakes of fruits and vegetables(13–21). Therefore, in order to ascertain the full range of fruit and vegetable intake, it is worthwhile to investigate whether a combination of biomarkers, possibly in combination with other factors, provides more reliable results. Baldrick et al. (22) found that the carotenoids and vitamin C are the most consistently responsive biomarkers for fruit and vegetable intake. In addition, serum/plasma folate may be used as a biomarker of fruit and vegetable intake, even though it is a less sensitive marker, especially in countries where fortification with folate is mandatory(23, 24). In order to be able to use biomarkers to quantify the consumption of fruits and vegetables, a dose–response relationship between fruit and vegetable intake and the respective biomarkers must be present. Because dietary intake recorded by subjects is often biased, a cross-sectional study with such data will not provide us with an unbiased estimate of the dose–response curve. In contrast, for diet-controlled intervention studies in which fruits and vegetables are provided to the participants, the intake data does not rely solely on self-reporting. In these studies, the combination of information about the amounts provided, information from supervised consumption and self-reported information on compliance may lead to a less biased estimate of fruit and vegetable intake. We therefore conducted an individual participant data meta-analysis of such studies, covering a wide range of fruit and vegetable intakes.
The first aim of the present study was to investigate the dose–response curve between fruit and vegetable consumption and multiple biomarkers, namely, serum carotenoids (α-carotene, β-carotene, β-cryptoxanthin, lycopene, lutein and zeaxanthin), serum/plasma folate and serum/plasma vitamin C. The second aim was to establish a prediction model of fruit and vegetable intake based on these biomarkers which may be used as a predictive biomarker or to estimate group-level intake.

Methods

Search strategy

The aim of the literature search was to find diet-controlled intervention studies (i.e. food provision studies or partly supervised feeding studies) conducted with adult subjects in which reports on the amount of consumed fruits and vegetables were supported by information on the amounts provided and in which significant efforts were made to maximise compliance. The following diet-controlled intervention studies were included: (1) studies in which all foods and drinks were provided to the subjects during the intervention, and (2) studies in which all fruits and vegetables consumed were provided to the subjects. In addition, studies had to have measured carotenoid or folate concentrations in the blood after the intervention, and papers had to be published in the English language. The search was conducted in Scopus, in PubMed and by a manual search of reference lists. Search terms in the titles and abstracts included ‘fruit’ and ‘vegetables’ combined with ‘intervention’, ‘trial’ and ‘feeding study’. These terms were then combined with ‘biomarkers’, ‘biological markers’, ‘carotenoids’, ‘α-carotene’, ‘beta-carotene’, ‘beta-cryptoxanthin’, ‘zeaxanthin’, ‘lycopene’, ‘lutein’, ‘folate’ and ‘bioavailability’. The search included studies published before October 2012.

Papers were first screened based on their titles and abstracts. The full texts of the papers that were considered potentially relevant were then retrieved, read and judged against the inclusion and exclusion criteria. The exclusion criteria were: (1) intervention studies in which the intervention consisted of dietary advice or counselling (and therefore foods were not provided to the subjects by the investigators); (2) intervention studies in which not all fruits and vegetables were provided (i.e. the provision consisted of additional fruits and vegetables on top of normal fruit and vegetable consumption) or in which fruits and vegetables were provided as supplements (e.g. capsules), juices or extracts; (3) intervention studies in which the intervention involved a single ingestion of the intervention food(s) or an intervention period of 6 d or fewer; and (4) studies that were conducted in children, adolescents, institutionalised elderly or pregnant or lactating women.

Data

The current contact details of each study's corresponding author, first author or other authors were searched on the Internet. Authors were contacted by email and asked whether they were willing to send the original data of the study. These authors were offered a co-authorship on the present paper. We requested individual participant data (where available) of subject characteristics (sex, age, height, weight (or BMI) and smoking status), serum/plasma values of biomarkers and intake of fruits and vegetables (or intervention group coding).

In addition, we collected information on: (1) the study design (whether it was a parallel or crossover study, whether a run-in period was included and, where applicable, whether a wash-out period was included); (2) the dietary intervention (the duration of the dietary intervention and the daily intake of fruits and vegetables, carotenoids or folate); and (3) the serum/plasma measurements (whether blood was drawn after a fasting period and which methods were used for sample analysis).

Statistical analysis

Outliers, which were defined as all observations above (Q3+4 × IQR) (where Q3 refers to the third quartile and IQR refers to the interquartile range), were removed from the dataset. The median number of outliers per biomarker was 1 (range: 0–7).
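The outlier rule above can be illustrated with a minimal Python sketch. It is not the code used for the analysis (which was run in SAS); the function name, the example values and the quartile method (the inclusive method of Python's `statistics.quantiles`) are assumptions made for illustration, because the text does not state how the quartiles were computed.

```python
import statistics

def remove_high_outliers(values, k=4.0):
    """Drop observations above the upper fence Q3 + k * IQR.

    Only the upper fence is applied, matching the definition in the text.
    The quartile method ("inclusive") is an assumption.
    """
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    upper_fence = q3 + k * (q3 - q1)
    return [v for v in values if v <= upper_fence]

# Hypothetical biomarker concentrations with one extreme value.
concentrations = [0.10, 0.12, 0.15, 0.18, 0.20, 0.22, 0.25, 5.00]
cleaned = remove_high_outliers(concentrations)
```

With a multiplier of 4, only values far above the bulk of the distribution are dropped, which fits the small number of outliers reported per biomarker.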

Dose–response curves

The dose–response curve between log-transformed biomarker concentrations (dependent variable) and fruit and vegetable intake (independent variable) and between biomarker concentrations and the corresponding micronutrient was estimated using fractional polynomials( 25 , 26 ). To account for the one crossover study and for between-study heterogeneity, the final parameter estimates were calculated using mixed models with study and subjects as random effects. Therefore, the estimated variance components refer to differences between studies, differences between individuals (to account for the crossover study) and residual variance.

To obtain predictions on the original scale rather than on the logarithmic scale, we applied the following back-transformation:

$$E(Y) = \exp\left(\beta_{0} + \sum_{k=1}^{p} \beta_{k} X_{k} + \frac{1}{2}\sigma^{2}\right),$$

where Y is the biomarker concentration on the original scale, E(Y) is the expectation of Y, X_k (k = 1, …, p) are the fruit and vegetable intake terms of the dose–response model, β_0 and β_k are the regression coefficients of the dose–response model and σ² is the sum of the variance components estimated in the mixed model.
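As a worked illustration of this back-transformation, the following Python sketch evaluates the expression for a single intake term. All numbers are hypothetical placeholders, not estimates from the present analysis, and the function name is ours.

```python
import math

def back_transform(beta0, betas, xs, sigma2):
    """Expected value on the original scale for a model fitted to log(Y).

    beta0  : intercept of the log-scale model
    betas  : regression coefficients, one per intake term
    xs     : intake terms X_k (already transformed as in the model)
    sigma2 : sum of the variance components of the mixed model
    """
    linear_predictor = beta0 + sum(b * x for b, x in zip(betas, xs))
    return math.exp(linear_predictor + 0.5 * sigma2)

# Hypothetical values: one linear intake term at 500 g/d.
expected = back_transform(beta0=-1.0, betas=[0.002], xs=[500.0], sigma2=0.3)
naive = math.exp(-1.0 + 0.002 * 500.0)  # omits the variance term
```

The term ½σ² matters because exponentiating the log-scale prediction alone gives the median rather than the mean of a log-normal variable, so the naive value is always the smaller of the two.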

Several covariates were tested to see whether they statistically significantly predicted the biomarker concentrations. Covariates that were tested included age, BMI, sex and smoking. In addition, the interaction between fruit and vegetable intake and these covariates was tested. The covariates and interactions were tested by including them one at a time in separate fractional polynomial regression models.

Prediction models of fruit and vegetable intake

We developed three different prediction models based on what we learned from the dose–response curves. The models were estimated using linear regression: (1) a pre-specified model in which all continuous variables were added as linear terms, (2) a pre-specified model in which the shape of all continuous variables was established using multivariable fractional polynomials (MFP; referred to as the MFP model), and (3) a reduced model that included only the statistically significant predictors which were selected using MFP (referred to as the reduced MFP model). The MFP models were analysed using STATA/SE version 11.0 for Windows. Interactions between the subject characteristics (age, BMI, sex and smoking status) and the biomarkers (α-carotene, β-carotene, lutein+zeaxanthin, lycopene and β-cryptoxanthin) were tested for inclusion in the model in four separate models (including (1) main effects+age × biomarkers; (2) main effects+BMI × biomarkers; (3) main effects+sex × biomarkers; and (4) main effects+smoking status × biomarkers). All interactions were included as linear terms. Interactions with P< 0·05 were considered relevant for inclusion in the prediction model. These interactions were then tested together in the model, and a backward selection was applied until all interactions included in the model had a P value of < 0·05.

Because data on predictors and outcomes were not complete, we used a multiple imputation approach in which ten multiple imputed datasets were created. The power and selection of the predictors were established in each of the ten imputed datasets separately, and the final model was established by majority voting(27).
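The majority-voting step can be sketched as follows. This is a simplified illustration under the assumption that a predictor is retained when it is selected in more than half of the imputed datasets; the exact rule follows the cited reference and is not spelled out in the text. The predictor names and selections below are hypothetical.

```python
from collections import Counter

def majority_vote(selections):
    """Return the predictors selected in more than half of the datasets.

    selections: one collection of selected predictor names per imputed
    dataset. The "more than half" threshold is an assumption.
    """
    counts = Counter(name for selected in selections for name in set(selected))
    threshold = len(selections) / 2
    return sorted(name for name, count in counts.items() if count > threshold)

# Hypothetical selections across ten imputed datasets.
selections = (
    [["beta_carotene", "lutein", "age"]] * 6
    + [["beta_carotene", "folate"]] * 4
)
final_predictors = majority_vote(selections)
```

The same voting idea can be applied to the fractional polynomial power chosen for each predictor, by counting powers instead of predictor names.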

The validation of the fruit, vegetable and juice intake (FVJ) and fruit and vegetable intake (excluding juices; FV) prediction models was assessed using tenfold cross-validation. First, the data was imputed as it was earlier, and then the data was randomly separated into ten parts. One part was left out as the test set, the remaining nine parts formed the training set, and the prediction models were fitted to the training set of each of the imputed datasets using linear regression models. The regression coefficients were combined using normal procedures to obtain the regression coefficients for the test data. The out-of-sample data (the test set) was used to calculate the predicted values for each individual by multiplying the regression coefficients and the observed values of the predictors in each of the imputed test sets. The final predicted values were calculated by averaging the predicted values over the ten imputed test sets. Each of the parts was left out once, so the procedure was repeated ten times. These predicted values were compared to the observed values as an estimate of the model performance using three different measures: (1) the root mean squared error (RMSE) = $$\sqrt{\frac{1}{n}\sum (Y - \hat{Y})^{2}}$$, where Y is the observed intake and Ŷ is the predicted intake, (2) the correlation between observed intake and predicted intake, and (3) the mean difference (observed intake minus predicted intake) with the corresponding limits of agreement at the individual level (i.e. mean difference ± 1·96 × SD of the differences). Unless otherwise indicated, all analyses were performed using SAS version 9.2 (SAS Institute, Inc.).
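The three performance measures can be computed as in the minimal Python sketch below. It is an illustration rather than the SAS code used for the analysis; the function name and example intakes are hypothetical, and the use of the sample standard deviation (n − 1 denominator) for the limits of agreement is an assumption.

```python
import math
import statistics

def performance(observed, predicted):
    """RMSE, Pearson correlation, mean difference and limits of agreement."""
    n = len(observed)
    diffs = [o - p for o, p in zip(observed, predicted)]  # observed minus predicted
    rmse = math.sqrt(sum(d * d for d in diffs) / n)

    mean_o = statistics.fmean(observed)
    mean_p = statistics.fmean(predicted)
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    ss_o = sum((o - mean_o) ** 2 for o in observed)
    ss_p = sum((p - mean_p) ** 2 for p in predicted)
    correlation = cov / math.sqrt(ss_o * ss_p)

    mean_diff = statistics.fmean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD, an assumption
    limits = (mean_diff - 1.96 * sd_diff, mean_diff + 1.96 * sd_diff)
    return {"rmse": rmse, "correlation": correlation,
            "mean_difference": mean_diff, "limits_of_agreement": limits}

# Hypothetical observed and predicted intakes in g/d.
result = performance([100.0, 300.0, 500.0, 700.0], [150.0, 250.0, 550.0, 650.0])
```

A small mean difference combined with wide limits of agreement indicates a model that is accurate on average but imprecise for individuals.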

Results

Search and data retrieval

A total of 1002 studies were found, of which twenty-seven qualified for inclusion in the present meta-analysis(28–54). Of these twenty-seven papers, eight publications described a study population that was also involved in another publication. Therefore, the authors of a total of nineteen unique diet-controlled intervention studies were contacted for cooperation in retrieving individual data. The flowchart of the selection of studies is shown in Fig. 1. A total of twelve authors responded positively to the request and made their data available for the present analysis. A summary of study characteristics of these studies is given in Table 1, and an overview of the data of these studies is presented in Tables 2 and 3. The data of four studies were unfortunately unavailable, and three authors did not respond to our request. Information from these studies is available in online supplementary Table SA.

Fig. 1 Flow diagram of study selection process.

Table 1 Overview of study characteristics of included studies

F&V, fruit and vegetables; FV, fruit and vegetable intake, excluding juices; FVJ, fruit, vegetable and juice intake; FBV, fruit, berries and vegetables.

* The number of individuals used in the present analysis. In brackets, the number of individuals reported in the original publication. For several studies, specific intervention groups were not useful in the present analysis( 36 , 38 , 41 , 49 , 50 , 52 ), and for one study( 44 ), data of a subset of participants was received.

In brackets, indication of whether the amount of fruits and vegetables reported in the table and used in the analysis was the amount provided to the subjects (indicated by ‘P’) or whether the amount relied partly on self-reporting (indicated by ‘R’).

The folate data of that study were no longer available( 34 ).

Table 2 Baseline characteristics of the included studies (Mean values and standard deviations)

* These data are taken from the original publication, but they were not available for the present analysis.

Table 3 Baseline characteristics of the included studies (Mean values and standard deviations)

For six studies, specific groups were not useful in the present analysis( 36 , 38 , 41 , 49 , 50 , 52 ), and for one study( 44 ), data of a subset of participants was received. For the study by Miller et al. ( 44 ), intake of fruits and vegetables in serves was converted to g/d by multiplying the number of serves by 80 g. For the study by Itsiopoulos et al. ( 40 ), intake of fruits and vegetables was known for fifteen subjects. For the remaining twelve subjects, vegetable intake was imputed as the mean of the intake reported in the paper (i.e. 466 g/d vegetables and 162 g/d fruits). Where necessary, α-carotene, β-carotene and lycopene were converted from μg/ml to μmol/l.
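The two conversions mentioned in this note can be written out as a short Python sketch. The factor of 80 g per serve is taken from the text; the molar mass of about 536·87 g/mol (α-carotene, β-carotene and lycopene share the formula C40H56) is a standard chemical value that we supply as an assumption, because the conversion factor actually used is not reported here.

```python
GRAMS_PER_SERVE = 80.0        # from the text
MOLAR_MASS_C40H56 = 536.87    # g/mol; assumed, not stated in the text

def serves_to_grams(serves_per_day):
    """Convert serves/d of fruits and vegetables to g/d."""
    return serves_per_day * GRAMS_PER_SERVE

def ug_per_ml_to_umol_per_l(concentration, molar_mass=MOLAR_MASS_C40H56):
    """Convert a concentration from ug/ml to umol/l.

    1 ug/ml equals 1000 ug/l; dividing by the molar mass (ug/umol)
    gives umol/l.
    """
    return concentration * 1000.0 / molar_mass

intake_g = serves_to_grams(5)                    # 5 serves/d
beta_carotene = ug_per_ml_to_umol_per_l(0.25)    # hypothetical 0.25 ug/ml
```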

Dose–response analysis

The estimated dose–response curves between the different biomarkers and FVJ are shown in Fig. 2, and the dose–response curves between the biomarkers and FV are shown in Fig. 3. All biomarkers show a positive dose–response relationship with fruit and vegetable intake. The regression equations that were obtained are shown in online supplementary Table SB.

Fig. 2 Dose–response curves between serum carotenoids ((a) α-carotene, (b) β-carotene, (c) lutein, (d) zeaxanthin, (e) β-cryptoxanthin, (f) lycopene), (g) plasma/serum folate and (h) vitamin C and fruit, vegetable and juice intake. The ○ indicate the individual data points, and their sizes are proportional to the number of individuals for each specific intake (i.e. the larger the circle, the more individuals were available for analysis).

Fig. 3 Dose–response curves between serum carotenoids ((a) α-carotene, (b) β-carotene, (c) lutein, (d) zeaxanthin, (e) β-cryptoxanthin, (f) lycopene), (g) plasma/serum folate and (h) vitamin C and fruit and vegetable intake (excluding juices). The ○ indicate the individual data points, and their sizes are proportional to the number of individuals for each specific intake (i.e. the larger the circle, the more individuals were available for analysis).

The P values of the covariate and interaction analyses are shown in online supplementary Table SC. Age and smoking were significant predictors for all carotenoids but not for plasma folate. BMI was a significant predictor for α-carotene, β-carotene, lutein, β-cryptoxanthin and lycopene. Sex was only a significant predictor for lutein, zeaxanthin and lutein+zeaxanthin. The interactions between these covariates and the intake of fruits and vegetables were relevant (P< 0·1) in most instances. The smoking × fruit and vegetable interaction was only a significant predictor for about half of the biomarkers, but this may be a result of the relatively low number of smokers included in the present sample.

Where possible, the dose–response relationship between the biomarkers and the intake of the micronutrient was also investigated (online supplementary Fig. SA). The available sample size was largest for β-carotene (n 316) and smallest for lutein+zeaxanthin (n 35). The sample size of zeaxanthin was too low to warrant analysis. All curves showed a positive relationship between intake and serum or plasma concentrations except lutein at high intakes. There is no biological evidence for the drop that is visible in the lutein curve. Because there were very few data available for lutein intake of more than 15 mg/d, this part of the curve is not considered reliable.

Prediction model

The regression coefficients of the final prediction model are presented in Table 4, and the performance measures are presented in Table 5. The power and variable selection process of the MFP and the reduced MFP model is shown in online supplementary Tables SD and SE. For FVJ, the reduced MFP model showed the lowest RMSE (i.e. 258·0 g) and the highest correlation between observed and predicted (i.e. 0·78) as compared to the linear model and the full pre-specified MFP model. The mean difference of the reduced MFP model ( − 1·7 g) was slightly higher than those of the other two models (linear model: − 1·6 g; MFP model: − 1·5 g), but the limits of agreement were markedly smaller than those of the other two models. Bland–Altman plots are presented in online supplementary Fig. SB.

Table 4 The predictors on the multiple completed* datasets (n 492 in each completed dataset) from a linear regression analysis (Regression coefficients, standard errors, and powers)

FVJ, fruit, vegetable and juice intake; FV, fruit and vegetable intake, excluding juices.

* Completed datasets refers to the data after multiple imputation.

The study of Chopra et al. ( 37 ) could not be used in the present analysis because of an estimation problem.

Folate is scaled as folate/10.

§ Age is scaled as age/10.

Table 5 Performance measures of the different prediction models as calculated by cross-validation

FVJ, fruit, vegetable and juice intake; FV, fruit and vegetable intake, excluding juices; RMSE, root mean squared error; MFP, multivariable fractional polynomials.

For FV, the MFP model was the best model. It showed the lowest RMSE (201·1 g), the highest correlation (0·65) and the lowest mean bias (2·4 g) with the smallest limits of agreement ( − 368·2, 373·0 g).

The prediction model for FV showed a somewhat lower correlation and a higher absolute mean difference than the model for fruit and vegetable intake including juices. Therefore, we investigated whether a model including a predictor variable that represented juice intake (in g/d) would improve the prediction for fruit and vegetable intake when juices were excluded. However, this did not markedly change the results. The MFP model including juice as a predictor variable had an RMSE of 202·8 g, a correlation of 0·64, a mean bias of 0·2 g (limits of agreement: − 374·1, 374·6 g). Therefore, the simpler model without juice as a predictor variable is preferred as a prediction model for FV.

In order to compare the performance of the prediction model with the current practice of using the sum of carotenoids or any single biomarker, we calculated the correlation coefficients between the observed intakes and the sum of carotenoids and those between observed intakes and single biomarkers (Table 6). For FVJ, the correlations ranged between 0·04 and 0·32, which was much lower than the 0·78 in the prediction model. Also for FV, the correlations (between 0·15 and 0·38) were lower than that in the prediction model (0·65).

Table 6 Pearson correlations between fruit and vegetable intake and biomarkers

FVJ, fruit, vegetable and juice intake; FV, fruit and vegetable intake, excluding juices.

To indicate the value of the prediction model for individual studies, an additional cross-validation was performed by leaving one entire study out of the training set. The study that was left out comprised the test set. Table 7 shows the RMSE and mean difference with the limits of agreement for the reduced MFP model for FVJ and the MFP model for FV. These show that there is a difference between how well the prediction models perform in each study. The study by Karlsen et al. ( 41 ) shows a worse performance for FVJ but not for FV. This is most likely caused by the relatively high intake of fruits, vegetables and juices in that study (see Table 1).
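The leave-one-study-out scheme can be sketched in Python as a generator of training and test splits. The record layout (a list of dictionaries with a `study` key) and the function name are hypothetical; model fitting and imputation are left out.

```python
def leave_one_study_out(records, study_key="study"):
    """Yield (held_out_study, training_records, test_records) per study.

    Each study in turn forms the test set, and all remaining studies
    form the training set.
    """
    studies = sorted({record[study_key] for record in records})
    for held_out in studies:
        train = [r for r in records if r[study_key] != held_out]
        test = [r for r in records if r[study_key] == held_out]
        yield held_out, train, test

# Hypothetical individual participant records from three studies.
records = [
    {"study": "A", "intake": 200.0},
    {"study": "A", "intake": 400.0},
    {"study": "B", "intake": 600.0},
    {"study": "C", "intake": 800.0},
]
splits = list(leave_one_study_out(records))
```

Unlike random tenfold cross-validation, this split keeps all participants of a study together, so each test set resembles the situation of applying the model to a new study.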

Table 7 Performance measures of the best-performing prediction models per study as calculated by cross-validation

FVJ, fruit, vegetable and juice intake; MFP, multivariable fractional polynomials; FV, fruit and vegetable intake, excluding juices; RMSE, root mean squared error.

Discussion

The first part of the present research showed that all investigated biomarkers (carotenoids, folate and vitamin C) had a positive relationship with fruit and vegetable intake, and they are therefore useful for predicting fruit and vegetable intake. Several covariates were significantly associated with the biomarkers. The next aim was to develop a prediction model for fruit and vegetable intake based on objective variables, such as biomarkers and subject characteristics. Among the three models for predicting FVJ that were investigated, the reduced MFP model showed the best performance in cross-validation, and the MFP model showed the best performance for FV.

The sum of carotenoids has been used in an attempt to combine biomarkers into a single estimate for fruit and vegetable intake in various studies. The sum of carotenoids was positively correlated with self-reported fruit and vegetable intake(14–21, 55, 56). In the present study, the correlations between the predicted values and the observed fruit and vegetable intake (both including and excluding juices) were markedly higher than the correlations between the observed intakes and the sum of carotenoids or any of the single biomarkers. In future research, the predicted values can easily be calculated by multiplying observed values from biomarkers and subject characteristics with the corresponding β coefficients from Table 4 and then adding these together. Despite the model's good performance on average, there was some residual variation as well as an overestimation of low fruit and vegetable intake and an underestimation of high fruit and vegetable intake. Not all fruits and vegetables contain the same concentration of carotenoids and folate, and other foods in the diet also contain these nutrients. Therefore, the type of fruits and vegetables eaten and the diet as a whole influence the final biomarker concentrations in the blood. The present study tried to capture ‘normal’ diet effects as much as possible by excluding those studies that provided only a single type of fruit or vegetable and by including intervention arms that focused on carotenoid-rich or folate-rich and carotenoid-poor or folate-poor fruits and vegetables. In order to obtain the large-sample benefits of a meta-analysis, these different study types were grouped together. Because a number of studies were included, we assumed that the applied regression analysis would average out the effects of individual studies and that at least the first approximation would not depend on the types of fruits and vegetables included. Obviously, the assumption is not true in an absolute sense, seeing as carrots, for example, contain more carotenoids than some other vegetables, and this will thus require further investigation.
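The calculation of a predicted intake from the β coefficients, as described in this paragraph, amounts to a weighted sum. The Python sketch below shows the arithmetic only. The coefficients are placeholders and not the values of Table 4, the predictor names are hypothetical, and all terms are treated as linear, whereas the actual models may use fractional polynomial powers and rescaled predictors (e.g. age/10).

```python
def predict_intake(coefficients, predictors):
    """Intercept plus the sum of coefficient * observed value."""
    total = coefficients["intercept"]
    for name, value in predictors.items():
        total += coefficients[name] * value
    return total

# Placeholder coefficients for illustration only, NOT those of Table 4.
hypothetical_coefficients = {
    "intercept": 100.0,
    "beta_carotene": 200.0,   # g/d per umol/l
    "age_per_10y": -10.0,     # g/d per 10 years of age
}
subject = {"beta_carotene": 0.5, "age_per_10y": 4.5}
predicted_g_per_day = predict_intake(hypothetical_coefficients, subject)
```

Averaging such predictions over a group gives the group-level estimate, which is the use that the performance measures support.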

Another source of variability may come from the different intervention durations. We excluded studies with a duration of less than 7 d because we assumed that it would take approximately 1 week to obtain a new steady state for the carotenoids after the change in diet was induced by the intervention( 57 ). The actual duration of the studies included in the prediction models was much longer (Table 1).

Differences in the analytical methods used in the different studies may be another source of residual variation. In particular, folate levels were analysed using different assays, e.g. immunoassay and radioassay. Also, laboratory variability may be caused by different specimen collection and storage techniques(58), among many other possible sources.

Sex, age, BMI and smoking affect serum carotenoid, serum vitamin C and plasma folate levels, and several other covariates, such as serum cholesterol, serum TAG and the consumption of alcohol, fat and energy, may also be related to the biomarkers(59–63). It may be of interest to investigate whether these covariates could significantly improve the prediction model. However, the present data did not allow us to investigate this thoroughly.

Although significant efforts were made in all individual studies to encourage compliance with the study protocol (e.g. the supervised consumption of meals; see Table 1), the true intake of fruits and vegetables could not always be determined with absolute certainty because it relied on self-reports of compliance. In quite a number of the individual studies, compliance was investigated with, e.g., questionnaires or diaries, and most often this self-reported compliance was high.

Unfortunately, no external validation data was available for the prediction model. We chose to use all of the data from the diet-controlled intervention studies that were available to us to develop the models. To perform an external validation, data from other or new diet-controlled intervention studies would have to be obtained. Because this would be very complicated and because the data from such studies would preferably be used to develop or improve the present model rather than to just validate it, we mimicked independent data by using cross-validation to calculate the measures of performance( 64 ).

The use of individual participant data from diet-controlled intervention studies made it possible to model the dose–response curves and the prediction models for a large range of fruit and vegetable intake with a relatively large number of subjects using a more objective assessment of intake. However, between-study differences may have influenced the study results. In the dose–response analysis, we took clustering into account by using mixed-effects models( 65 ). For the prediction model, the marginal predictions (i.e. using only the fixed effects because the (unknown) random effects cannot be used in predictions for new subjects) from the random intercept linear regression model performed somewhat worse in cross-validation than the predictions from the standard regression model (data not shown), and we therefore chose to present the standard regression model. Bouwmeester et al. ( 66 ) found similar performance measures for a standard logistic regression model and a random intercept logistic regression model in a study on surgical patients that were clustered by anaesthesiologist. Recently, Debray et al. ( 67 ) developed an approach to deal with risk prediction in new patients that takes into account the random intercept after the model has been developed using individual participant data meta-analysis with mixed-effects modelling. In the present study, the performance of the conditional predictions was not considerably better than the performance of the standard predictions in an apparent validation (i.e. an internal validation based on the entire data, not using cross-validation) (data not shown).

In conclusion, the relatively strong correlations between predictions and actual intake indicate that the present prediction models may be used to investigate the ranking of individuals with regard to their intake of fruits and vegetables when validating questionnaires that measure intake (e.g. FFQ or 24 h recall). Furthermore, the low mean bias shows that the models have good potential to be used to estimate average fruit and vegetable intake at the group level. The large limits of agreement indicate that the prediction models should not be used to estimate individual fruit and vegetable intake.
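The three performance measures behind these conclusions can be computed as in the sketch below. It assumes the conventional Bland–Altman definition of the limits of agreement (mean difference ± 1.96 SD of the differences). The function name and the example intakes are hypothetical and are not data from the present study.

```python
from statistics import mean, stdev


def agreement_summary(predicted, actual):
    """Pearson correlation, mean bias and 95 % limits of agreement."""
    diffs = [p - a for p, a in zip(predicted, actual)]
    bias = mean(diffs)                 # group-level over- or underestimation
    sd = stdev(diffs)                  # spread of individual errors
    mp, ma = mean(predicted), mean(actual)
    cov = sum((p - mp) * (a - ma) for p, a in zip(predicted, actual))
    var_p = sum((p - mp) ** 2 for p in predicted)
    var_a = sum((a - ma) ** 2 for a in actual)
    return {
        "pearson_r": cov / (var_p * var_a) ** 0.5,  # ranking of individuals
        "mean_bias": bias,
        "loa_lower": bias - 1.96 * sd,
        "loa_upper": bias + 1.96 * sd,
    }


# Hypothetical predicted and actual intakes (g/d) for four subjects.
summary = agreement_summary([110, 190, 320, 380], [100, 200, 300, 400])
```

In this toy example the individual errors cancel out, so the mean bias is zero and the correlation is high, yet the limits of agreement remain wide relative to the errors. This is the same pattern that supports group-level use and ranking while arguing against individual-level estimates.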

Supplementary material

To view supplementary material for the present article, please visit http://dx.doi.org/10.1017/S0007114515000355

Acknowledgements

The present research was financially supported by ZonMW (project number 200400014). ZonMW had no role in the design, analysis or writing of the present article.

The authors declare that there is no conflict of interest.

The authors' responsibilities were as follows: H. C. B. designed the research; R. F., B. W., A. B., E. R. M., J. J. M. C., W. J. P., K. v. d. H., M. C., A. K., L. O. D., R. W., C. I., L. B., K. O., C. A. v. L.-B. and T. H. J. N. provided essential data that were used for the present study; J. H. M. d. V. and H. v. d. V. provided essential advice; O. W. S. performed the statistical analysis; O. W. S. and H. C. B. wrote the paper; O. W. S. and H. C. B. had primary responsibility for final content. All authors read and approved the final manuscript.

References

1 Boeing, H, Bechthold, A, Bub, A, et al. (2012) Critical review: vegetables and fruit in the prevention of chronic diseases. Eur J Nutr 51, 637–663.
2 Hung, HC, Joshipura, KJ, Jiang, R, et al. (2004) Fruit and vegetable intake and risk of major chronic disease. J Natl Cancer Inst 96, 1577–1584.
3 Riboli, E & Norat, T (2003) Epidemiologic evidence of the protective effect of fruit and vegetables on cancer risk. Am J Clin Nutr 78, Suppl. 3, 559S–569S.
4 Macdonald, HM, Hardcastle, AC, Duthie, GG, et al. (2009) Changes in vitamin biomarkers during a 2-year intervention trial involving increased fruit and vegetable consumption by free-living volunteers. Br J Nutr 102, 1477–1486.
5 Newman, VA, Flatt, SW & Pierce, JP (2008) Telephone counseling promotes dietary change in healthy adults: results of a pilot trial. J Am Diet Assoc 108, 1350–1354.
6 Rock, CL, Moskowitz, A, Huizar, B, et al. (2001) High vegetable and fruit diet intervention in premenopausal women with cervical intraepithelial neoplasia. J Am Diet Assoc 101, 1167–1174.
7 Day, N, McKeown, N, Wong, M, et al. (2001) Epidemiological assessment of diet: a comparison of a 7-day diary with a food frequency questionnaire using urinary markers of nitrogen, potassium and sodium. Int J Epidemiol 30, 309–317.
8 Kipnis, V, Midthune, D, Freedman, L, et al. (2002) Bias in dietary-report instruments and its implications for nutritional epidemiology. Public Health Nutr 5, 915–923.
9 Kipnis, V, Midthune, D, Freedman, LS, et al. (2001) Empirical evidence of correlated biases in dietary assessment instruments and its implications. Am J Epidemiol 153, 394–403.
10 Kipnis, V, Subar, AF, Midthune, D, et al. (2003) Structure of dietary measurement error: results of the OPEN biomarker study. Am J Epidemiol 158, 14–21, discussion 22–26.
11 Tasevska, N, Midthune, D, Potischman, N, et al. (2011) Use of the predictive sugars biomarker to evaluate self-reported total sugars intake in the Observing Protein and Energy Nutrition (OPEN) study. Cancer Epidemiol Biomarkers Prev 20, 490–500.
12 Tasevska, N, Runswick, SA, McTaggart, A, et al. (2005) Urinary sucrose and fructose as biomarkers for sugar consumption. Cancer Epidemiol Biomarkers Prev 14, 1287–1294.
13 Andersen, LF, Veierod, MB, Johansson, L, et al. (2005) Evaluation of three dietary assessment methods and serum biomarkers as measures of fruit and vegetable intake, using the method of triads. Br J Nutr 93, 519–527.
14 Bogers, RP, Dagnelie, PC, Westerterp, KR, et al. (2003) Using a correction factor to correct for overreporting in a food-frequency questionnaire does not improve biomarker-assessed validity of estimates for fruit and vegetable consumption. J Nutr 133, 1213–1219.
15 Bogers, RP, Van Assema, P, Kester, AD, et al. (2004) Reproducibility, validity, and responsiveness to change of a short questionnaire for measuring fruit and vegetable intake. Am J Epidemiol 159, 900–909.
16 Brantsaeter, AL, Haugen, M, Rasmussen, SE, et al. (2007) Urine flavonoids and plasma carotenoids in the validation of fruit, vegetable and tea intake during pregnancy in the Norwegian Mother and Child Cohort Study (MoBa). Public Health Nutr 10, 838–847.
17 Carlsen, MH, Karlsen, A, Lillegaard, IT, et al. (2011) Relative validity of fruit and vegetable intake estimated from an FFQ, using carotenoid and flavonoid biomarkers and the method of triads. Br J Nutr 105, 1530–1538.
18 Jansen, MC, Van Kappel, AL, Ocke, MC, et al. (2004) Plasma carotenoid levels in Dutch men and women, and the relation with vegetable and fruit consumption. Eur J Clin Nutr 58, 1386–1395.
19 Jilcott, SB, Keyserling, TC, Samuel-Hodge, CD, et al. (2007) Validation of a brief dietary assessment to guide counseling for cardiovascular disease risk reduction in an underserved population. J Am Diet Assoc 107, 246–255.
20 Resnicow, K, Odom, E, Wang, T, et al. (2000) Validation of three food frequency questionnaires and 24-hour recalls with serum carotenoid levels in a sample of African-American adults. Am J Epidemiol 152, 1072–1080.
21 Toft, U, Kristoffersen, L, Ladelund, S, et al. (2008) Relative validity of a food frequency questionnaire used in the Inter99 study. Eur J Clin Nutr 62, 1038–1046.
22 Baldrick, FR, Woodside, JV, Elborn, JS, et al. (2011) Biomarkers of fruit and vegetable intake in human intervention studies: a systematic review. Crit Rev Food Sci Nutr 51, 795–815.
23 Brevik, A, Vollset, SE, Tell, GS, et al. (2005) Plasma concentration of folate as a biomarker for the intake of fruit and vegetables: the Hordaland Homocysteine Study. Am J Clin Nutr 81, 434–439.
24 Willett, WC (2013) Nutritional Epidemiology, 3rd ed. Oxford: Oxford University Press.
25 Royston, P & Altman, DG (1994) Regression using fractional polynomials of continuous covariates – parsimonious parametric modeling. Appl Stat 43, 429–467.
26 Sauerbrei, W & Royston, P (1999) Building multivariable prognostic and diagnostic models: transformation of the predictors by using fractional polynomials. J R Stat Soc Ser A Stat Soc 162, 71–94.
27 Vergouwe, Y, Royston, P, Moons, KGM, et al. (2010) Development and validation of a prediction model with missing predictor data: a practical approach. J Clin Epidemiol 63, 205–214.
28 Appel, LJ, Miller, ER III, Jee, SH, et al. (2000) Effect of dietary patterns on serum homocysteine: results of a randomized, controlled feeding study. Circulation 102, 852–857.
29 Bøhn, SK, Myhrstad, MC, Thoresen, M, et al. (2010) Blood cell gene expression associated with cellular stress defense is modulated by antioxidant-rich food in a randomised controlled clinical trial of male smokers. BMC Med 8, 54.
30 Bowen, PE, Garg, V, Stacewicz-Sapuntzakis, M, et al. (1993) Variability of serum carotenoids in response to controlled diets containing six servings of fruits and vegetables per day. Ann N Y Acad Sci 691, 241–243.
31 Brevik, A, Andersen, LF, Karlsen, A, et al. (2004) Six carotenoids in plasma used to assess recommended intake of fruits and vegetables in a controlled feeding study. Eur J Clin Nutr 58, 1166–1173.
32 Briviba, K, Bub, A, Möseneder, J, et al. (2008) No differences in DNA damage and antioxidant capacity between intervention groups of healthy, nonsmoking men receiving 2, 5, or 8 servings/d of vegetables and fruit. Nutr Cancer 60, 164–170.
33 Broekmans, WMR, Klöpping-Ketelaars, IAA, Schuurman, CRWC, et al. (2000) Fruits and vegetables increase plasma carotenoids and vitamins and decrease homocysteine in humans. J Nutr 130, 1578–1583.
34 Brouwer, IA, Van Dusseldorp, M, West, CE, et al. (1999) Dietary folate from vegetables and citrus fruit decreases plasma homocysteine concentrations in humans in a dietary controlled trial. J Nutr 129, 1135–1139.
35 Castenmiller, JJ, van de Poll, CJ, West, CE, et al. (2000) Bioavailability of folate from processed spinach in humans. Effect of food matrix and interaction with carotenoids. Ann Nutr Metab 44, 163–169.
36 Castenmiller, JJ, West, CE, Linssen, JP, et al. (1999) The food matrix of spinach is a limiting factor in determining the bioavailability of β-carotene and to a lesser extent of lutein in humans. J Nutr 129, 349–355.
37 Chopra, M, O'Neill, ME, Keogh, N, et al. (2000) Influence of increased fruit and vegetable intake on plasma and lipoprotein carotenoids and LDL oxidation in smokers and nonsmokers. Clin Chem 46, 1818–1829.
38 Dragsted, LO, Pedersen, A, Hermetter, A, et al. (2004) The 6-a-day study: effects of fruit and vegetables on markers of oxidative stress and antioxidative defense in healthy nonsmokers. Am J Clin Nutr 79, 1060–1072.
39 Freese, R, Alfthan, G, Jauhiainen, M, et al. (2002) High intakes of vegetables, berries, and apples combined with a high intake of linoleic or oleic acid only slightly affect markers of lipid peroxidation and lipoprotein metabolism in healthy subjects. Am J Clin Nutr 76, 950–960.
40 Itsiopoulos, C, Brazionis, L, Kaimakamis, M, et al. (2011) Can the Mediterranean diet lower HbA1c in type 2 diabetes? Results from a randomized cross-over study. Nutr Metab Cardiovasc Dis 21, 740–747.
41 Karlsen, A, Svendsen, M, Seljeflot, I, et al. (2011) Compliance, tolerability and safety of two antioxidant-rich diets: a randomised controlled trial in male smokers. Br J Nutr 106, 557–571.
42 Martini, MC, Campbell, DR, Gross, MD, et al. (1995) Plasma carotenoids as biomarkers of vegetable intake: The University of Minnesota cancer prevention research unit feeding studies. Cancer Epidemiol Biomarkers Prev 4, 491–496.
43 Miller, ER III, Appel, LJ & Risby, TH (1998) Effect of dietary patterns on measures of lipid peroxidation: results from a randomized clinical trial. Circulation 98, 2390–2395.
44 Miller, ER III, Erlinger, TP, Sacks, FM, et al. (2005) A dietary pattern that lowers oxidative stress increases antibodies to oxidized LDL: results from a randomized controlled feeding study. Atherosclerosis 183, 175–182.
45 Misikangas, M, Freese, R, Turpeinen, AM, et al. (2001) High linoleic acid, low vegetable, and high oleic acid, high vegetable diets affect platelet activation similarly in healthy women and men. J Nutr 131, 1700–1705.
46 Moller, P, Vogel, U, Pedersen, A, et al. (2003) No effect of 600 grams fruit and vegetables per day on oxidative DNA damage and repair in healthy nonsmokers. Cancer Epidemiol Biomarkers Prev 12, 1016–1022.
47 Silaste, ML, Rantala, M, Alfthan, G, et al. (2003) Plasma homocysteine concentration is decreased by dietary intervention. Br J Nutr 89, 295–301.
48 Silaste, ML, Rantala, M, Alfthan, G, et al. (2004) Changes in dietary fat intake alter plasma levels of oxidized, low-density lipoprotein and lipoprotein(a). Arterioscler Thromb Vasc Biol 24, 498–503.
49 van het Hof, KH, Brouwer, IA, West, CE, et al. (1999) Bioavailability of lutein from vegetables is 5 times higher than that of β-carotene. Am J Clin Nutr 70, 261–268.
50 Van Loo-Bouwman, CA, West, CE, Van Breemen, RB, et al. (2009) Vitamin A equivalency of β-carotene in healthy adults: limitation of the extrinsic dual-isotope dilution technique to measure matrix effect. Br J Nutr 101, 1837–1845.
51 Watzl, B, Kulling, SE, Möseneder, J, et al. (2005) A 4-wk intervention with high intake of carotenoid-rich vegetables and fruit reduces plasma C-reactive protein in healthy, nonsmoking men. Am J Clin Nutr 82, 1052–1058.
52 Winkels, RM, Brouwer, IA, Siebelink, E, et al. (2007) Bioavailability of food folates is 80 % of that of folic acid. Am J Clin Nutr 85, 465–473.
53 Yeon, JY, Kim, HS & Sung, MK (2012) Diets rich in fruits and vegetables suppress blood biomarkers of metabolic stress in overweight women. Prev Med 54, S109–S115.
54 Yeum, KJ, Booth, SL, Sadowski, JA, et al. (1996) Human plasma carotenoid response to the ingestion of controlled diets high in fruits and vegetables. Am J Clin Nutr 64, 594–602.
55 Crispim, SP, Geelen, A, Souverein, OW, et al. (2011) Biomarker-based evaluation of two 24-h recalls for comparing usual fish, fruit and vegetable intakes across European centers in the EFCOVAL Study. Eur J Clin Nutr 65, Suppl. 1, S38–S47.
56 Kristal, AR, Vizenor, NC, Patterson, RE, et al. (2000) Precision and bias of food frequency-based measures of fruit and vegetable intakes. Cancer Epidemiol Biomarkers Prev 9, 939–944.
57 Chopra, M, McLoone, U, O'Neill, M, et al. (1996) Fruit and vegetable supplementation – effect on ex vivo LDL oxidation in humans. In Natural Antioxidants and Food Quality in Atherosclerosis and Cancer Prevention, pp. 150–155 [Kumpulainen, JT and Salonen, JT, editors]. London: The Royal Society of Chemistry.
58 Blanck, HM, Bowman, BA, Cooper, GR, et al. (2003) Laboratory issues: use of nutritional biomarkers. J Nutr 133, Suppl. 3, 888S–894S.
59 Brady, WE, Mares-Perlman, JA, Bowen, P, et al. (1996) Human serum carotenoid concentrations are related to physiologic and lifestyle factors. J Nutr 126, 129–137.
60 Drewnowski, A, Rock, CL, Henderson, SA, et al. (1997) Serum β-carotene and vitamin C as biomarkers of vegetable and fruit intakes in a community-based sample of French adults. Am J Clin Nutr 65, 1796–1802.
61 Maiani, G, Caston, MJ, Catasta, G, et al. (2009) Carotenoids: actual knowledge on food sources, intakes, stability and bioavailability and their protective role in humans. Mol Nutr Food Res 53, Suppl. 2, S194–S218.
62 Tucker, KL, Selhub, J, Wilson, PW, et al. (1996) Dietary intake pattern relates to plasma folate and homocysteine concentrations in the Framingham Heart Study. J Nutr 126, 3025–3031.
63 van Kappel, AL, Steghens, JP, Zeleniuch-Jacquotte, A, et al. (2001) Serum carotenoids as biomarkers of fruit and vegetable consumption in the New York Women's Health Study. Public Health Nutr 4, 829–835.
64 Efron, B (1983) Estimating the error rate of a prediction rule: improvement on cross-validation. J Am Stat Assoc 78, 316–331.
65 Abo-Zaid, G, Guo, B, Deeks, JJ, et al. (2013) Individual participant data meta-analyses should not ignore clustering. J Clin Epidemiol 66, 865–873, e864.
66 Bouwmeester, W, Twisk, JW, Kappen, TH, et al. (2013) Prediction models for clustered data: comparison of a random intercept and standard regression model. BMC Med Res Methodol 13, 19.
67 Debray, TP, Moons, KG, Ahmed, I, et al. (2013) A framework for developing, implementing, and evaluating clinical prediction models in an individual participant data meta-analysis. Stat Med 32, 3158–3180.