The epidemiology and aetiology of diarrhoeal disease in infancy in southern Vietnam: a birth cohort study

Highlights
• The diarrhoeal disease burden in a large, prospective infant cohort in Vietnam is defined.
• Minimum incidence of clinic-based diarrhoea in infants: 271/1000 infant-years.
• Rotavirus was most commonly identified, followed by norovirus and bacterial pathogens.
• Frequent repeat infections with the same pathogen within 1 year.
• Inclusion of rotavirus in the immunization schedule for Vietnam is warranted.

Shigella spp, Salmonella spp, and Campylobacter spp as the aetiological agents of diarrhoea in hospitalized children under 5 years of age in Ho Chi Minh City (HCMC). 6 How well these data represent the community-level burden of diarrhoeal disease is unclear. Further, these data suggest that the majority of hospitalized diarrhoea cases are in children <12 months of age, 6 which is the pivotal age group at which rotavirus vaccine should be targeted. Longitudinal community cohort studies provide an opportunity to evaluate the epidemiology and disease burden of diarrhoea more fully than hospital-based research. However, few studies have evaluated the incidence of diarrhoea in Vietnam, 3,7 and to date none have focused exclusively on the tropical south of the country. To address this knowledge gap, we sought to define the burden, aetiology, and risk factors for diarrhoeal disease through community cohorts of infants in two distinct settings in this densely populated, rapidly industrializing region. A better understanding of the epidemiology and aetiologies of diarrhoeal disease in southern Vietnam will inform rational public health interventions.

Description of the cohort
The cohort structure and methodology have been described previously. 8 Briefly, pregnant women were enrolled from 2009 to 2013 in southern Vietnam in two locations: women resident in central HCMC, the largest city in southern Vietnam, were enrolled at Hung Vuong Obstetric Hospital in HCMC; women resident in Cao Lanh District, Dong Thap Province, which is 120 km southwest of HCMC and situated in a semi-rural setting, were enrolled at Dong Thap Provincial Hospital. After delivery, infants were enrolled and followed up for the first 12 months of life with routine visits at 2, 4, 6, 9, and 12 months of age. A brief questionnaire detailing growth and illness in the preceding period since the last visit was administered, and a series of samples (blood, throat swab, nasopharyngeal swab) was collected at each routine visit.

Diarrhoeal episode detection
During the 12 months of follow-up, passive detection of diarrhoeal illness was performed, in which families were asked to take their child to a designated study clinic if the infant was unwell. At presentation, a brief clinical report was collected, as well as a stool sample. If the child was admitted, a detailed clinical evaluation was recorded. Blood samples were collected at the discretion of the treating physician. A new episode of diarrhoea was defined by ≥7 days between the onset dates of symptoms. Diarrhoea was defined as three watery loose stools or at least one bloody/mucoid diarrhoeal stool within 24 h, 9 or an increase in stool frequency as determined by the parent's judgement.
A secondary source of data on diarrhoeal episodes was self-reports by the mother of diarrhoeal illness in their infant for the period prior to each study visit.
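The episode definition above can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, and it assumes the 7-day gap is measured from the onset date of the previously counted episode, which the text does not specify.

```python
from datetime import date

MIN_GAP_DAYS = 7  # threshold between onset dates, as stated in the text


def split_into_episodes(onset_dates):
    """Return the onset dates that start a new episode.

    Assumption (not stated in the source): the gap is measured from the
    onset of the most recently counted episode for the same infant.
    """
    episodes = []
    for onset in sorted(onset_dates):
        if not episodes or (onset - episodes[-1]).days >= MIN_GAP_DAYS:
            episodes.append(onset)
    return episodes


# Hypothetical presentations for one infant
presentations = [date(2012, 1, 1), date(2012, 1, 4),
                 date(2012, 1, 8), date(2012, 1, 20)]
episode_starts = split_into_episodes(presentations)
```

Under this reading, the presentation on 4 January is folded into the episode that began on 1 January, while those on 8 and 20 January count as new episodes.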

Laboratory analysis
Stool samples collected from diarrhoeal episodes were stored at 4 °C until transport within 24 h and were then stored at −80 °C until further testing. One-step reverse transcriptase (RT) PCRs for rotavirus and norovirus genogroups I and II (GI and GII) were performed using RNA Master Hydrolysis Probes (Roche Applied Sciences, UK) on a LightCycler 480 (Roche Applied Sciences, UK) with the primers and probe sequences and PCR cycling conditions described previously. 10 Real-time PCR cycling conditions for Shigella (target ipaH) and Campylobacter (Campylobacter jejuni target: hipO; Campylobacter coli target: glyA) were as follows: 95 °C for 15 min, followed by 40 cycles of 95 °C for 5 s, 60 °C for 30 s, 72 °C for 30 s, as described previously. 11,12 Salmonella was detected using an in-house assay targeting the invA gene, which is conserved across the eight Salmonella subspecies, with cycling conditions as follows: 95 °C for 15 min, followed by 45 cycles of 95 °C for 5 s, 60 °C for 60 s. The sequences of the primers and probe for the invA gene were as follows: forward 5′-TCATCGCACCGTCAAARGA-3′, reverse 5′-CGATTTGAARGCCGGTATTATT-3′, probe: 5′-FAM-ACGCTTCGCCGTTCRCGYGC-BHQ1-3′. The limit of detection was 5 copies/reaction. Stool samples were not available from self-reported diarrhoea episodes.
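The invA primers and probe contain the degenerate bases R and Y. As an illustration only, the following Python sketch expands each oligonucleotide into the concrete sequences it represents, using the standard IUPAC meanings (R = A or G, Y = C or T). The helper name and dictionary are hypothetical, and only the codes that occur in these sequences are included.

```python
from itertools import product

# Subset of IUPAC nucleotide codes needed for the invA oligonucleotides
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "AG", "Y": "CT"}

# Sequences as given in the text (5' to 3', dye and quencher labels omitted)
INVA_OLIGOS = {
    "forward": "TCATCGCACCGTCAAARGA",
    "reverse": "CGATTTGAARGCCGGTATTATT",
    "probe": "ACGCTTCGCCGTTCRCGYGC",
}


def expand(seq):
    """List every non-degenerate sequence encoded by a degenerate oligo."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in seq))]


variants = {name: expand(seq) for name, seq in INVA_OLIGOS.items()}
```

Each primer carries a single R and so expands to two sequences; the probe carries one R and one Y and expands to four.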

Statistical analyses
Two separate incidence measurements were calculated: one evaluating diarrhoeal presentations at a study clinic and/or admitted to hospital, and the other based solely on self-reported diarrhoeal illness derived from information collected at the routine follow-up visits. These data were not merged. Infant-years of observation (IYO) for each infant were derived from the date of birth and date of exit from the study due to either completion of follow-up, documented early withdrawal, or loss to follow-up, defined by the last routine visit or illness presentation, whichever was later, if the full 12-month follow-up period was not completed. Pathogen-specific incidence estimates were not calculated due to low counts, but the incidence of aetiological groups (bacterial, viral, or mixed infection) was evaluated. Comparisons between groups were made using the Kruskal-Wallis test for continuous variables with non-normal distributions and the Chi-square test for categorical variables.
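As a minimal sketch of the incidence calculation described above, the Python below derives infant-years of observation from birth and exit dates and expresses events per 1000 IYO. The function names and the 365.25-day year are assumptions made for illustration; the source does not state how days were converted to years.

```python
from datetime import date

DAYS_PER_YEAR = 365.25  # assumed conversion, not stated in the source


def infant_years(birth, exit_date):
    """Infant-years of observation between birth and study exit."""
    return (exit_date - birth).days / DAYS_PER_YEAR


def incidence_per_1000(events, iyo):
    """Events per 1000 infant-years of observation."""
    return 1000.0 * events / iyo


# Hypothetical infant followed from birth for one calendar year
one_infant = infant_years(date(2010, 1, 1), date(2011, 1, 1))

# Cohort totals reported in the results: 1690 clinic-based presentations
# over 6239.4 IYO
overall = incidence_per_1000(1690, 6239.4)
```

With the cohort totals reported in the results, this gives approximately 271 presentations per 1000 IYO.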
Multivariable negative binomial regression was used to identify risk factors associated with severe diarrhoea presenting to a study clinic and/or admitted to hospital. Regression was performed independently for each study site due to the heterogeneity in risk profiles between HCMC and Dong Thap. Factors were included in the multivariable model according to hypothesized associations determined a priori (maternal characteristics, socioeconomic indicators, household elevation), as well as those found to be significantly associated in the univariable analysis (p < 0.05). All analyses were performed in Stata v. 13 (StataCorp, College Station, TX, USA).
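For readers less familiar with the model, the regression above can be written out as follows. This is a sketch under stated assumptions: it uses the common mean-dispersion form of the negative binomial model, and the offset for each infant's observation time $t_i$ is an assumption, since the text does not say whether an offset was used. Here $Y_i$ is the number of diarrhoeal presentations for infant $i$, $x_{ij}$ are the covariates, and $\alpha$ is the overdispersion parameter.

```latex
\begin{align}
  \log \mu_i &= \log t_i + \beta_0 + \sum_{j} \beta_j x_{ij},
  \qquad \mu_i = \mathbb{E}[Y_i], \\
  \operatorname{Var}(Y_i) &= \mu_i + \alpha \mu_i^{2}, \\
  \mathrm{IRR}_j &= e^{\beta_j},
  \qquad 95\%\ \mathrm{CI} = \exp\!\bigl(\hat\beta_j \pm 1.96\,\widehat{\mathrm{SE}}(\hat\beta_j)\bigr).
\end{align}
```

An incidence rate ratio below 1 indicates a lower rate of presentations per unit increase in the covariate, and a ratio above 1 a higher rate.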

Spatial clustering analyses
To investigate the presence of spatial clustering of diarrhoeal illness, we used a Bernoulli model with all diagnosed episodes of diarrhoea as cases, and children without any reported history of diarrhoeal episodes as the background population using SaTScan v. 9.1.1 (http://www.satscan.org/). Each pathogen in turn was also considered as a case, with the control group remaining all children in the cohort with no reported episode. For the analyses, the upper limit for cluster detection was specified as 50% of the study population. The significance of the detected clusters was assessed by a likelihood ratio test, with a p-value obtained by 999 Monte Carlo simulations generated under the null hypothesis of a random spatiotemporal distribution.
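The test statistic behind this kind of analysis can be illustrated with a short Python sketch. It is not the SaTScan implementation: it only shows the standard Bernoulli log likelihood ratio for a single candidate window and the rank-based Monte Carlo p-value. The function names are hypothetical, and the search over circular windows and the generation of the simulated data sets are left out.

```python
import math


def _xlogy(x, y):
    """Return x * ln(y), using the convention 0 * ln(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)


def bernoulli_llr(c, n, C, N):
    """Log likelihood ratio for one window under the Bernoulli model.

    c, n: cases and total children (cases + controls) inside the window
    C, N: cases and total children in the whole study area
    Only windows with a higher case proportion inside than outside score > 0.
    """
    if n == 0 or n == N:
        return 0.0
    p_in, p_out = c / n, (C - c) / (N - n)
    if p_in <= p_out:
        return 0.0
    log_alt = (_xlogy(c, p_in) + _xlogy(n - c, 1 - p_in)
               + _xlogy(C - c, p_out)
               + _xlogy((N - n) - (C - c), 1 - p_out))
    log_null = _xlogy(C, C / N) + _xlogy(N - C, 1 - C / N)
    return log_alt - log_null


def monte_carlo_p(observed, simulated):
    """Rank-based p-value: share of replicates at least as extreme."""
    exceed = sum(1 for s in simulated if s >= observed)
    return (1 + exceed) / (len(simulated) + 1)
```

With 999 replicates, the smallest attainable p-value is 1/1000, which is why clusters that outrank every simulated data set are reported as p < 0.001 or p = 0.001.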

Ethics
Four hospitals in HCMC (Hospital for Tropical Diseases, Hung Vuong Obstetric Hospital, District 8 Hospital, Children's Hospital 1) and Dong Thap Provincial Hospital participated in the study. The protocol was approved by the institutional review boards of all these hospitals, as well as the Oxford Tropical Research Ethics Committee. Written informed consent was obtained from all participants.

Baseline characteristics of the cohorts
From July 2009 to December 2013, a total of 6706 infants were enrolled in the birth cohort from 6679 mothers (27 sets of twins). A total of 6239.4 infant-years of observation (IYO) were recorded for these children. In Dong Thap, there were 2458 infants enrolled with 2199.4 IYO, and in HCMC there were 4248 infants enrolled with 4040 IYO. The full 12-month follow-up was completed by 87% of the cohort, with 33% (289/884) of early exits occurring after at least 9 months of cohort membership. Slightly over half of enrolled babies were male (52%), with roughly 5% being of low birth weight (<2500 g) (Table 1). The majority of children (91%) were breastfed after birth; 33% were exclusively breastfed. The use of milk formula after birth was more frequently reported in HCMC (92%) compared with Dong Thap (26%). Households in Dong Thap were more likely to have characteristics of lower socioeconomic status compared to HCMC, with a higher prevalence of household crowding, a lack of flush toilets, use of river water as the primary water source, and lower maternal education level (Table 1).

Incidence of diarrhoeal disease
During the follow-up period there were 1690 diarrhoeal presentations detected through clinic-based surveillance. The majority of these illnesses were treated on an outpatient basis (91.4%). The minimum incidence of diarrhoeal presentations estimated for the cohort as a whole was 271/1000 IYO. In Dong Thap, this figure was 604.3/1000 IYO and in HCMC was 89.4/1000 IYO. The minimum incidence estimates for hospitalized diarrhoeal illness in each location were 57.3/1000 IYO and 4.5/1000 IYO, respectively. There were 1656 self-reported diarrhoeal episodes at routine follow-up visits, corresponding to an incidence of 265.4/1000 IYO for the entire cohort. The incidence of self-reported diarrhoea was more similar between the study sites than the clinic-based estimates: in Dong Thap it was 318.3/1000 IYO and in HCMC it was 236.6/1000 IYO.
Stool samples were collected from a far greater proportion of diarrhoeal episodes in the Dong Thap cohort compared to the HCMC cohort (86% vs. 47%). For inpatient diarrhoeal episodes in particular, the completeness of stool sample collection was far higher in Dong Thap (103/126; 82%) than in HCMC (5/18; 28%). This was due to difficulties in identifying hospital admissions of cohort members in real time in HCMC. The proportion of samples positive for at least one pathogen did not differ between sites, but was collectively higher among inpatient samples than outpatient samples (69% vs. 56%). The distribution of aetiologies differed significantly between HCMC and Dong Thap (Chi-square p < 0.001), with viral infections more common in HCMC and bacterial and mixed viral/bacterial infections more common in Dong Thap (Figure 1A). Mixed viral/bacterial infections were more common among hospitalized diarrhoeal cases than outpatients; however, the overall distribution of aetiologies was not significantly different between outpatients and inpatients (Chi-square p = 0.09; Figure 1B). Among all detected diarrhoeal episodes, infections with a mixed viral/bacterial aetiology were most likely to be admitted to hospital (26%, 35/133), followed by viral infections (17%, 67/391).
Repeat infections with the same pathogen were identified in a subset of infants. Rotavirus was identified in 365 infants, 32 of whom (9%) had at least two discrete rotavirus infections separated by at least 7 days. This proportion was the same for norovirus (15/163), Shigella (10/108), and Campylobacter (12/141). Of the 120 infants with Salmonella infection, 15 (13%) had at least two distinct episodes where Salmonella was detected. Figure 2 shows the distribution of the interval between repeated infections, by pathogen. The median interval between repeated infections ranged from 37 days for Salmonella to 106 days for norovirus.

Clinical characteristics by aetiological group
Amongst all 1690 diarrhoeal episodes detected by clinic-based surveillance, the median age of the affected infants was 6.5 months (interquartile range (IQR) 4.6-8.7 months). A total of 55% (n = 934) of all diarrhoeal cases were male. Amongst episodes with an identified aetiology, infants with mixed infections tended to be slightly older (median 8 months) compared to those with the other aetiological groups (Table 2). The median axillary temperature at hospital admission was 37.8 °C (IQR 37-38.5 °C). The median length of stay in hospital for all admitted diarrhoeal episodes was 5 days (IQR 3-7 days).
[Figure 2 caption, partial: The numbers below each pathogen label indicate the total number of secondary or tertiary infections for that pathogen.]

Risk factors for diarrhoeal disease
Risk factors for diarrhoea were investigated by site. In the unadjusted analysis, increased maternal education was protective against diarrhoea in HCMC, whereas male sex, household crowding, use of a piped water supply, and filtering drinking water were all significant risks (Table 3). In a multivariable analysis, maternal education (incidence rate ratio (IRR) 0.75, 95% confidence interval (CI) 0.56-1.00) remained independently associated with protection, and household crowding (≥2 people/room; IRR 1.45, 95% CI 1.07-1.95) along with filtering drinking water (IRR 1.81, 95% CI 1.17-2.81) remained risk factors in this setting.
In Dong Thap, the most important protective factors included maternal age at delivery, maternal education, and filtering of the drinking water supply (Table 4). Male sex and the lack of a flush toilet were risk factors in this setting. After adjusting for confounding, male sex remained the only strongly associated risk factor (IRR 1.20, 95% CI 1.04-1.40), and maternal age (IRR 0.98, 95% CI 0.96-0.99) and education (IRR 0.75, 95% CI 0.62-0.91) remained protective.

Spatial clustering
As shown in Figure 3, in Dong Thap there was evidence of spatial clustering for each detected pathogen. For all-cause diarrhoea, a cluster was identified with a radius of 6.7 km in the northwest region of the study area (relative risk (RR) 1.79, p < 0.001). All of the pathogen-specific clusters centred generally around the same area, in the more rural part of the Dong Thap study area, with radii ranging from 6.6 km (rotavirus) to 12.4 km (Campylobacter) and RRs from 2.3 (rotavirus) to 3.7 (Shigella). No significant spatial clustering was identified in HCMC (data not shown).

Discussion
Diarrhoea remains one of the most common yet preventable conditions affecting the poorest children globally. 1 Through a large, longitudinal birth cohort, a substantial burden of diarrhoeal disease in the first year of life was identified in southern Vietnam, with an estimated minimum incidence of 271/1000 IYO. This is an order of magnitude less than an estimate in infants aged <12 months from the late 1990s in rural Hanoi (3.3/child/year), 7 yet it is higher than the incidence estimated in children under 5 years of age in central Vietnam in 2001-2003 (115/1000 child-years). 3 Differences in disease incidence may have arisen from study design, as the study from rural Hanoi included partially active surveillance. Furthermore, although Dong Thap seemingly had a much higher minimum incidence (604/1000 IYO) than HCMC (89/1000 IYO), the large difference is very likely due to under-ascertainment in HCMC, as the number of healthcare providers in this urban setting is much greater than in semi-rural Dong Thap, 13 and cohort participants therefore had greater opportunity to seek care at non-study clinics.
Viral infections represented the largest burden amongst all diagnosed diarrhoeal presentations in this study, consistent with an earlier hospital-based study in HCMC. 6 The distribution of aetiologies differed between the two sites, with viral infections identified more frequently in HCMC. This may be confounded by under-ascertainment of hospitalized cases, in particular in HCMC, since hospitalized cases were more likely to have a mixed viral/bacterial aetiology. Campylobacter was the most frequently detected bacterial pathogen in our cohort, which is in contrast to the recently published Global Enteric Multicenter Study (GEMS), which identified Shigella as the third most common cause of disease, behind rotavirus and Cryptosporidium, in moderate to severe diarrhoea in the first year of life across seven different Asian and African countries. 2 As no control specimens were collected from healthy children in the present study, the aetiological role of the detected organisms cannot be determined. However, results from a hospital-based study in HCMC suggest that these organisms are not frequently identified in children without diarrhoea, with only 13% of approximately 600 non-diarrhoeal controls positive for an enteric pathogen. 6
Through this work a large burden of potentially vaccine-preventable rotavirus disease was identified in infants. Over half of all samples with an identified aetiology were positive for rotavirus, with 13% of all rotavirus episodes admitted to hospital. Rotavirus vaccine is available as a 'user pays' product in Vietnam (predominantly the Rotarix monovalent vaccine (GlaxoSmithKline)), but uptake is low due to the prohibitive cost (US$ 70-80) and a lack of vaccine availability in many regions, including Dong Thap. Only 24% of children in HCMC were vaccinated against rotavirus in our cohort.
The Vietnamese Ministry of Health has sponsored a locally produced, live-attenuated monovalent rotavirus candidate vaccine, with some success in the early stages of clinical evaluation. 14 Previous work has shown that rotavirus vaccination, if GAVI-subsidized, would be cost-effective in Vietnam, 15 and safe when co-administered within the current expanded programme on immunization (EPI) structure. 16 Furthermore, immune responses (IgA and serum neutralizing antibody) measured against the pentavalent vaccine (RotaTeq) in Vietnamese children were shown in one study to be comparable to those among children in Latin America and Europe. 17 This suggests that rotavirus vaccination in Vietnam may not suffer from the same level of reduced immunogenicity that has been observed to occur with orally administered enteric vaccines in developing countries. 18 The majority of the identified infections were found to have occurred after 6 months of age, potentially due to the waning of protective maternal antibody and generally high rates of breastfeeding after birth, 19,20 and increased exposure to pathogens with the start of consumption of solid foods. The risk factors for diarrhoeal disease identified through this work, including household crowding, low maternal age, and male sex, are generally consistent with the literature. 4,5 In HCMC, drinking filtered piped water was a significant risk for diarrhoeal disease, although the number of families reporting filtering was relatively low. This may be due to the use of ceramic filters that have pores too large to mechanically prevent viruses from entering the drinking water supply. 21 The absence of a measurable protective effect of rotavirus vaccination in HCMC likely reflects the imperfect case ascertainment, as well as the fact that approximately 50% of diarrhoeal episodes with a known aetiology were associated with pathogens other than rotavirus. 
The identification of increased spatial risk for diarrhoeal disease in the north-western region of Cao Lanh District in Dong Thap may represent a hotspot of transmission, due potentially to poor sanitation or waste management practices.
The most important limitation in this work was the passive nature of diarrhoeal disease episode detection. Although the staff made every effort to ensure disease episodes were recorded, an unknown number of infants with diarrhoeal disease may have attended clinics other than ours, especially in HCMC, and the present results must be interpreted in light of this limitation. Therefore, the minimum incidence measurements herein likely underestimate the true burden, particularly in HCMC, and the risk factor and spatial analyses may be biased by misclassification of some infants with undetected diarrhoeal illness. This may also have affected the conclusions on diarrhoeal aetiology, if the distribution of pathogens among episodes from which no specimen was available differed from those specimens tested. The overall loss to follow-up rate was low, although such bias may also be present and important to consider. Finally, the number of pathogens screened for was limited and may explain the lack of an identified pathogen in almost half of the cases. In particular, screening was not performed for any viruses beyond norovirus and rotavirus, and parasites and diarrhoeagenic Escherichia coli, which are known to be prevalent amongst children with diarrhoea in industrializing countries, were not investigated. 2 Further work to more fully determine the epidemiology of diarrhoeal disease in this setting is warranted, particularly in the face of emerging antimicrobial resistance. 6,22 Active, community-based surveillance of high-risk populations would provide a more accurate estimation of the true extent of the burden.
Furthermore, as roughly 40% of diarrhoeal episodes collected in the present cohort study lacked a final diagnosis, investigation into the prevalence of additional pathogens, particularly Cryptosporidium, 2 would help local clinicians to better understand the range of potential aetiologies and corresponding therapies for their patient population. To explore these questions, enrolment into a cohort study of young children aged 1-5 years, as an extension of this birth cohort study, has recently been completed, which includes active surveillance for diarrhoeal disease and diagnosis of viral and bacterial gastrointestinal pathogens. 23 Through this, it will also be possible to explore the relative pathogenicity of isolated organisms as well as distinguish reinfection from long-term carriage, due to the collection of stool from healthy children as well.
In conclusion, the most comprehensive epidemiological description of paediatric diarrhoea in infancy in southern Vietnam, to date, is presented herein. A high burden of diarrhoeal disease in infants under the age of 12 months in both an urban and a semi-rural setting is documented, with a large proportion due to vaccine-preventable rotavirus infection. Future efforts to integrate either a GAVI-subsidized or a domestically produced rotavirus vaccine into the national EPI schedule should be pursued.