Genetic dissection of complex traits is one of the major challenges of modern genetics, and meeting it requires new analytical methods based on powerful, genetically informative samples (Zheng et al., 2013). The Wuhan Twin Birth Cohort (WTBC) is a prospective birth cohort study designed to elucidate the role of genetic and environmental factors in disease etiology (Gatz et al., 2015). Most twin studies carried out in China have recruited adolescent and adult twins; few have followed twins from gestation to adolescence (0–18 years old). Wuhan is the capital of Hubei province and the largest metropolitan city in central China, with a population of 10.61 million (Hubei Provincial Government, n.d.). The WTBC offers a unique opportunity to follow women from the prenatal to the postnatal period and to follow up their twins. This cohort is one of China's richest sets of genetically informative resources.
The WTBC was derived from the Wuhan Pre/post-natal Twin Birth Registry (WPTBR), one of the largest twin birth registries in China. The WPTBR is an ongoing population-based registry that continues to recruit women and their twins residing in Wuhan. It was established from the Wuhan Maternal and Child Health Management Information System (WMCHMIS), which underwent a strict, standardized quality assurance/quality control (QA/QC) procedure to ensure high-quality data. The WMCHMIS includes data collected prospectively on all women from the first trimester of pregnancy to delivery, and on their children from birth to 7 years of age. Since its inception in 2003, the system has recorded approximately 100,000 births annually from all maternity units in Wuhan (including urban and rural areas). Currently, the WMCHMIS covers 16 maternal and child health agencies (100%), 98 midwifery institutions (100%), 105 community health service centers (100%), 81 rural hospitals (100%), and 612 kindergartens (80%) in Wuhan.
In 2006, we initiated the WPTBR by recruiting women in the first trimester of a twin pregnancy and then their twins after birth. From January 2006 to May 2016, 13,869 twin pairs were registered with the WPTBR. The registry makes it possible to follow women from the prenatal to the postnatal period and to follow up their twins. Its longitudinal nature will allow researchers to trace the life history of mothers and twins and to disentangle the genetic and environmental influences on health and development.
Original funding sources for the registry were primarily from the National Health and Family Planning Commission of the People's Republic of China, United Nations International Children's Emergency Fund (UNICEF), and the Health and Family Planning Commission of Wuhan Municipality. Recently, the WPTBR received funding from the Gates Foundation, the Gates HBGDki project, as well as significant funding for a project entitled ‘Wuhan Twin Birth Cohort (WTBC)’ from the intramural research program of the Wuhan Medical and Healthcare Center for Women and Children (WMHCWC). Ethical approval and DNA data use approval were obtained from Wuhan Medical and Healthcare Center for Women and Children Ethical Review Board.
Major Goals of the WTBC
Our twin birth cohort aims to collect data on 25,000 twin pairs from pregnancy onward, with the goal of obtaining extensive information on the health and lifestyle of both twins and their mothers, such as physical characteristics, mental health, and behavior. Longitudinal follow-up and surveillance of common diseases will also be conducted. Additionally, the WTBC aims to study epigenetic markers and genes in both the intrauterine and childhood stages. There is evidence that the intrauterine period has the greatest epigenetic flux in human development (Foley et al., 2009; Saffery et al., 2012). However, few studies have addressed the extent of epigenetic changes that occur thereafter (Saffery et al., 2012). The longitudinal collection of biological specimens from different cell lineages will therefore address this gap in the research. We are focusing on epigenetic markers at birth and in childhood to provide clues to the causal links between intrauterine and early life exposures and phenotypes such as birth weight and disease risk in later life. We also aim to use the classical twin model (Keller & Coventry, 2005) to examine the environmental and genetic factors that lead to disease-discordant and exposure-discordant twins.
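As a rough illustration of the classical twin model, trait variance is commonly partitioned into additive genetic ($a^2$), shared environmental ($c^2$), and unique environmental ($e^2$) components. Under the model's standard assumptions (for example, equal environments for MZ and DZ pairs and no non-additive genetic effects), Falconer's approximations estimate these components from the within-pair correlations $r_{MZ}$ and $r_{DZ}$. This is a textbook sketch and not necessarily the estimation procedure that will be applied to WTBC data.

```latex
\begin{align}
a^2 &= 2\,(r_{MZ} - r_{DZ}) \\
c^2 &= 2\,r_{DZ} - r_{MZ} \\
e^2 &= 1 - r_{MZ}
\end{align}
```

For hypothetical correlations $r_{MZ} = 0.8$ and $r_{DZ} = 0.5$, these give $a^2 = 0.6$, $c^2 = 0.2$, and $e^2 = 0.2$, which sum to 1.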
Detailed Description of the Cohort
The WTBC grew out of the WPTBR. Data on women and their twins from the WPTBR, including demographic data and information on prenatal medical care, delivery, postpartum follow-up, and child healthcare, are therefore linked to the WTBC (Table 1). Three different samples were used for twin enrollment (Figure 1).
The first sample, which corresponds to a ‘population-based approach’, started on January 1, 2006, and ended on January 1, 2013. It was assembled by linkage to the WPTBR database and includes 6,920 twin pairs. It covers all pre/post-natal information from the WPTBR, but no biological specimens were collected. The eligibility criteria are as follows: (1) the expected delivery date is after January 2006; and (2) twin pairs are eligible regardless of birth outcome, including live-born infants, stillbirths, and fetal deaths.
The second sample started on January 1, 2013, and will continue to December 2020; it currently comprises 6,949 twin pairs. The collected information includes pre/post-natal information from the WPTBR database and neonatal dried blood spot samples; the blood spots are the only difference from the first sample. The biological specimens collected during this period were primarily dried blood spots from the entire Wuhan twin population, sampled by means of self-administered finger prick and stored on filter cards.
The third sample, which uses a ‘hospital-based approach’, started on March 28, 2016, and will end in December 2020. This approach consists of voluntary enrolment of twins at the WMHCWC. The target is to enroll 1,000 pregnant women and their twin pairs within this 5-year period. Following ethical approval, these women were recruited at the WMHCWC during early pregnancy (0–16 weeks of gestation). Recruiting at this time enables maternal and fetal factors to be measured in all three trimesters of pregnancy, so that exposures can be evaluated trimester by trimester, given that the specific windows of susceptibility are uncertain. This approach minimizes recall bias at baseline and uses an anxiety and depression scale, together with dietary and sleep questionnaires, to maintain contact with the women. The eligibility criteria are as follows: (1) the woman is willing to participate in the cohort study; (2) her gestational age is less than 16 weeks at entry into the cohort; (3) twins are diagnosed by B-mode ultrasound at the first prenatal examination; and (4) she has her prenatal examinations and gives birth at the WMHCWC. Women who are not willing to give birth at the WMHCWC are excluded from the study, even if they attend the WMHCWC for prenatal examinations. Currently, the WTBC has enrolled about 195 mothers and their twins through this approach.
For the first and second samples, we recruited all eligible mothers and twins registered with the WPTBR. Medical histories of pregnancies were recorded at least five times (once during the first trimester, twice during the second trimester, and twice during the third trimester). Child healthcare information was recorded at least five times in the first year of life (1 month, 3 months, 6 months, 8 months, and 12 months), twice in the second year (1.5 years and 2 years), twice in the third year (2.5 years and 3 years), and once a year thereafter until 7 years of age. Finally, dried blood samples are obtained from the babies at birth.
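The recording schedule described above can be written down as a small data structure, for example to check which child health visits should already have taken place at a given age. This is only an illustrative sketch: the names are hypothetical, and reading "once a year until 7 years" as visits at 4, 5, 6, and 7 years is an assumption, not a detail stated in the registry description.

```python
# Minimum number of prenatal medical records per trimester, as described in the text.
PRENATAL_RECORDS = {"first": 1, "second": 2, "third": 2}

# Scheduled child health visits, expressed in months of age.
CHILD_VISIT_MONTHS = (
    [1, 3, 6, 8, 12]        # first year of life
    + [18, 24]              # second year (1.5 and 2 years)
    + [30, 36]              # third year (2.5 and 3 years)
    + [48, 60, 72, 84]      # yearly until 7 years (assumed to be ages 4, 5, 6, 7)
)


def visits_due_by(age_months):
    """Return the scheduled visit ages (in months) up to and including age_months."""
    return [m for m in CHILD_VISIT_MONTHS if m <= age_months]


if __name__ == "__main__":
    print(sum(PRENATAL_RECORDS.values()), "prenatal records at minimum")
    print(len(CHILD_VISIT_MONTHS), "scheduled child health visits in total")
    print("Visits due by 24 months:", visits_due_by(24))
```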
For the third sample, we will take repeated biological samples until the twins reach 18 years of age. We will also record the health and lifestyle history of twins and their mothers, such as physical characteristics, mental health, and behavior. A further follow-up of twins beyond 18 years of age is currently being planned.
Measures and Zygosity Diagnosis
The main longitudinal data in our study are the repeated medical records from the WPTBR. The WPTBR collects information such as demographic characteristics, medical history, prenatal examinations, deliveries, and postnatal visits for mothers and infants, as well as children's medical information from birth to 7 years of age. The registry comprises seven databases: prenatal basic information, first prenatal visit, prenatal medical, birth record (delivery information of both mothers and twins), postpartum basic information, postpartum visit, and child health (Table 1).
The classification of zygosity is crucially important in twin studies. In the majority of birth cohorts, zygosity among same-sex twin pairs has been assessed by questionnaire, using specific questions about the similarity of the twins (Asaka & Ooki, 2004; Chen et al., 2010; Christiansen et al., 2003; Gao et al., 2006; Rietveld et al., 2000). This method classifies zygosity with more than 95% accuracy for same-sex twin pairs (Cutler et al., 2015). During the past decade, zygosity has increasingly been determined by genetic markers. For the first sample, we are planning to carry out zygosity assessment in the near future using the questionnaire method by telephone. For the second sample, we used both methods because of the nature of the sample. Of the 6,949 twin pairs enrolled, zygosity information for 2,488 same-sex twin pairs (35.80%) was obtained using genetic markers. For 1,858 pairs (26.74%), zygosity information will be collected using the questionnaire method, since we did not have blood spots for these pairs, and 2,366 pairs (34.05%) were opposite-sex twins. We did not have zygosity information for 237 pairs (3.41%) because the twins died before zygosity could be assessed. For the third sample, we are planning to carry out zygosity assessment for all enrolled twins using genetic markers in their blood samples.
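The breakdown of the second sample can be checked with a few lines of arithmetic. The sketch below uses only the counts reported above; the dictionary keys and function name are hypothetical labels chosen for illustration.

```python
# Counts of twin pairs in the second sample, as reported in the text.
TOTAL_PAIRS = 6949
ZYGOSITY_STATUS = {
    "same_sex_genetic_markers": 2488,        # zygosity obtained from genetic markers
    "same_sex_questionnaire_planned": 1858,  # no blood spot; questionnaire planned
    "opposite_sex": 2366,                    # not same-sex twins
    "not_assessed_death": 237,               # died before zygosity could be assessed
}


def shares(counts, total):
    """Return each group's share of the total, in percent, rounded to 2 decimals."""
    return {name: round(100 * n / total, 2) for name, n in counts.items()}


if __name__ == "__main__":
    # The four groups should account for every enrolled pair.
    assert sum(ZYGOSITY_STATUS.values()) == TOTAL_PAIRS
    for name, pct in shares(ZYGOSITY_STATUS, TOTAL_PAIRS).items():
        print(f"{name}: {pct:.2f}%")
```

The check confirms that the four groups together account for all 6,949 enrolled pairs.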
Genetic zygosity testing was based on genotyping 19 polymorphic short tandem repeat markers (D8S1179, D21S11, D7S820, CSF1PO, D3S1358, TH01, D13S317, D16S539, D2S1338, D19S433, vWA, TPOX, D18S51, D5S818, FGA, D12S391, Penta D, Penta E, and D6S1043) together with a segment of the X–Y homologous gene Amelogenin. This method classifies zygosity with 99.99% accuracy.
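The decision rule applied to the marker genotypes is not described here. A common simplified rule, shown below purely as an assumption, treats a same-sex pair as monozygotic when both twins carry identical genotypes at every typed marker and as dizygotic when any marker differs. Real pipelines also allow for genotyping error and mutation, which this sketch ignores. Amelogenin, which indicates sex, is left out of the comparison, and the function name and example genotypes are hypothetical.

```python
# The 19 STR markers named in the text (Amelogenin is omitted because it types sex).
STR_MARKERS = [
    "D8S1179", "D21S11", "D7S820", "CSF1PO", "D3S1358", "TH01", "D13S317",
    "D16S539", "D2S1338", "D19S433", "vWA", "TPOX", "D18S51", "D5S818",
    "FGA", "D12S391", "PentaD", "PentaE", "D6S1043",
]


def classify_zygosity(twin_a, twin_b, markers=STR_MARKERS):
    """Classify a same-sex pair from per-marker genotypes.

    Each twin is a dict mapping a marker name to a pair of allele values.
    Assumed rule: any discordant marker -> 'DZ'; concordant at all markers -> 'MZ';
    concordant but with untyped markers -> 'undetermined'.
    """
    compared = 0
    for marker in markers:
        genotype_a = twin_a.get(marker)
        genotype_b = twin_b.get(marker)
        if genotype_a is None or genotype_b is None:
            continue  # marker not typed in at least one twin
        compared += 1
        # Allele order within a genotype carries no meaning, so compare sorted.
        if sorted(genotype_a) != sorted(genotype_b):
            return "DZ"
    return "MZ" if compared == len(markers) else "undetermined"


if __name__ == "__main__":
    # Hypothetical genotypes, for illustration only.
    twin_1 = {m: (10, 12) for m in STR_MARKERS}
    twin_2 = {m: (12, 10) for m in STR_MARKERS}   # same alleles, different order
    twin_3 = {**twin_1, "TH01": (6, 9.3)}         # differs at one marker
    print(classify_zygosity(twin_1, twin_2))
    print(classify_zygosity(twin_1, twin_3))
```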
For the third sample, we established a biobank connected to the WTBC. Full details of sample and data collection and processing for the biobank are shown in Figure 2. Our trained staff collected the mothers’ blood and urine samples in the first, second, and third trimesters of pregnancy when they came to the health centers for their checkups. The fathers’ blood, semen, and urine samples were collected once during the mother's pregnancy. The trained staff also attended all deliveries to collect neonatal tissues (amniotic fluid, cord blood, cord, placenta, and meconium). They are also trained to collect the mothers’ stool and vaginal secretions before birth and breast milk at 1 month and 6 months after delivery. Finally, we collected stool from the twin pairs at 1 month and 6 months, urine at 6 months and 12 months, and blood at the age of 3, when they underwent the kindergarten admission examination. These steps enable the study of epigenetically diverse tissues from different cell lineages. The collected samples will also allow us to analyze epigenetic markers such as DNA methylation and histone modification in order to assess how gene expression is regulated by these markers. In addition, these samples will allow us to analyze the levels of heavy metal ions, organic pollutants, nutrients, and metabolites in the twin pairs.
The twin birth surveillance system developed by our health center is also used for the third sample. This system is mainly used for cohort management and information collection, including subject recruitment, follow-up appointments, reminders of upcoming follow-up appointments, follow-up registration, and questionnaire completion. Using the surveillance system, the trained staff can recruit mothers in the first trimester of pregnancy, record their basic information, and make follow-up appointments. The system also allows project managers to monitor the number and status of participants in a timely manner.
The questionnaires administered to the third sample collect information not available in the original database, such as family income level, mothers’ anxiety and depression during the first trimester of pregnancy, and children's diet during early childhood. We also prepared questionnaires to check children's general development at 6 months and at 1 year or older. These questionnaires are completed on a dedicated website linked to the network tracking system developed for the WTBC, which is an efficient and fast way of collecting data. In addition, we will collect health-related environmental information, such as exposure to ambient air pollution, unhealthy food, drugs, drinks, and a smoking environment. Recruitment takes place at the WMHCWC, and written informed consent is obtained from each mother at her first prenatal examination. Mothers are able to withdraw from the cohort at any time. They can also decline the invasive tests and still remain in the study.
Preliminary Findings and Accomplishments
Table 2 provides data on twin-specific factors such as birth outcome, preterm birth, low birth weight, delivery mode, malformation, and anthropometric measurements, including height and weight. Fifty percent of the twins were born preterm and with low birth weight, 90% of the twins were born by cesarean section, and 53% of the twins were boys. The median gestational age at birth was 35.7 weeks. Table 3 provides data on zygosity. Altogether, 1,143 twin pairs were monozygotic (MZ), and 3,710 twin pairs were dizygotic (DZ). Table 4 provides follow-up information about the twins from the first trimester of maternal pregnancy to 6 years (with at least one physical examination result for each year). The follow-up rate was 73.15% for 0–1 year, 60.50% for 1–2 years, 52.04% for 2–3 years, 37.12% for 3–4 years, 20.29% for 4–5 years, 16.14% for 5–6 years, and 10.07% for 6–7 years.
Our research on the WTBC has focused on constructing up-to-date, sex-specific birth weight references by gestational age for twin births in China. Such references for twin births are currently not available in this country. We conducted a population-based analysis of data on 22,507 eligible live-born twin infants born between August 1, 2006, and August 31, 2015, in all 95 hospitals of Wuhan. Gestational age in complete weeks was determined by combining estimation based on the last menstrual date with ultrasound examination. Smoothed percentile curves were created using the lambda-mu-sigma (LMS) method. Reference values for the 3rd, 10th, 25th, 50th, 75th, 90th, and 97th birth weight percentiles by sex and gestational age were derived from 11,861 male and 10,646 female twin newborns with a gestational age of 26 to 42 weeks. Separate birth weight percentile curves were constructed for male and female twins. In summary, we developed twin-specific birth weight curves, which are expected to be particularly useful for assessing the birth weights of twins (Bin et al., 2016).
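To illustrate how percentile curves follow from the lambda-mu-sigma method, the sketch below applies the standard LMS formulas: for a given gestational age with skewness L, median M, and coefficient of variation S, the birth weight at a standard normal deviate z is M(1 + LSz)^(1/L), or M·exp(Sz) when L is zero. The L, M, and S values used here are hypothetical placeholders and are not taken from the published twin reference; fitting the smoothed L, M, and S curves themselves is not shown.

```python
from math import exp, log
from statistics import NormalDist


def lms_centile(L, M, S, percentile):
    """Birth weight at the given percentile (0-100) for one set of LMS values."""
    z = NormalDist().inv_cdf(percentile / 100)
    if abs(L) < 1e-12:
        return M * exp(S * z)           # limiting case when L = 0
    return M * (1 + L * S * z) ** (1 / L)


def lms_zscore(weight, L, M, S):
    """Z-score of an observed birth weight for one set of LMS values."""
    if abs(L) < 1e-12:
        return log(weight / M) / S
    return ((weight / M) ** L - 1) / (L * S)


if __name__ == "__main__":
    # Hypothetical LMS values for a single sex and gestational week.
    L, M, S = 0.5, 2400.0, 0.15
    for p in (3, 10, 25, 50, 75, 90, 97):
        print(f"P{p}: {lms_centile(L, M, S, p):.0f} g")
    print(f"z-score of 2000 g: {lms_zscore(2000.0, L, M, S):.2f}")
```

Because the 50th percentile corresponds to z = 0, it always equals M, and the z-score function inverts the percentile function for the same L, M, and S.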
Strengths and Weaknesses of the WTBC
This cohort study, based on the twin birth registry in China, was established to address the lack of knowledge about the extent to which genetic and environmental factors cause variation in traits during twins’ development, and the scarcity of data on environmental and genetic influences on epigenetics from gestation to childhood in China. It is also a unique resource for clinical, epidemiological, and genetic studies. Other strengths of the WTBC are that it is population based and has a large sample size. Furthermore, it is a longitudinal birth cohort in which biospecimens have been taken from pregnancy to early childhood and can also be collected at any subsequent time. Data on the health and lifestyle of both twins and their mothers, such as physical characteristics, mental health, and behavior, are collected prospectively, so very little recall bias occurs during data collection. Because of our birth registry, the participants in our study can be easily tracked. We are able to use the registry data to determine whether there are differences in basic characteristics between participants and non-participants, and to achieve proper randomization, thereby ensuring that the sample obtained is representative of the population. In addition, the WTBC has an accurate classification of zygosity for all twin pairs in the third sample and most of the twin pairs in the second sample. This level of accuracy is not common in twin studies. The WTBC has been established by a multidisciplinary team with expertise in epidemiology, pediatrics, obstetrics, biostatistics, and clinical laboratory science. One weakness of our study is that we have not completed zygosity assessment for twin pairs in the first sample and part of the second sample. We are planning to carry out this assessment in the near future using the questionnaire method by telephone.
Accessibility of the Data
We are currently seeking collaborators with the goal of exploring complex gene–gene and gene–environment interactions using novel molecular and genetic methodologies and technologies. Data collection for the cohort is ongoing, and the cohort continues to grow. We welcome international and domestic collaborations with academic- and industry-based researchers. External researchers can gain access to the data through a collaboration agreement with the WTBC steering group. For more information on how to apply, please contact the principal investigator, Dr. Bin Zhang (firstname.lastname@example.org). The WTBC home page (http://www.whfuyou.com:8090/dl/) provides further information.
We would like to thank all of the twins and their mothers who participated in the study. We are also extremely grateful to all the hospitals and community health centers involved in this study. The original funding sources for the registry were primarily from the National Health and Family Planning Commission of the People's Republic of China, United Nations International Children's Emergency Fund (UNICEF), and the Health and Family Planning Commission of Wuhan Municipality. Recently, the WPTBR received funding from the Gates HBGDki project and significant funding for a project entitled ‘Wuhan Twin Birth Cohort’ (WTBC) from the intramural research program of the WMHCWC.
Disclosure of Interest