Reduced transmission of Mycobacterium africanum compared to Mycobacterium tuberculosis in urban West Africa

Highlights
• The estimated recent tuberculosis (TB) transmission rate (clustering rate of 41.2%) was found to be high in Ghana.
• There is a need for increased TB awareness by the national tuberculosis control program.
• Mycobacterium africanum (MAF) transmits at a lower rate compared to Mycobacterium tuberculosis in Ghana.
• The incidence of MAF remained fairly constant over the study years.
• Other factors may likely be responsible for maintaining MAF in West Africa.


Introduction
Tuberculosis (TB) is a global health emergency; in 2016 an estimated 10.4 million people fell ill, while 1.7 million died of TB (WHO, 2017). In 1993, the World Health Organization (WHO) declared TB a global health emergency and called for more efforts and resources to fight TB. Due largely to the inefficacy of the bacillus Calmette-Guérin (BCG) vaccine against pulmonary TB in adults, the current TB control strategy relies on case detection and treatment under the directly observed therapy short course (DOTS) strategy. The conventional indicators used to assess national control programs under this strategy focus on the proportion of cases that are cured at the end of treatment or whose sputum microscopy becomes negative after the first 2 months of treatment. Such indicators ignore equally important aspects of TB control, which include the duration of infectivity, the frequency of reactivation, and the risk of progression among the infected contacts, as well as the proportion of TB due to recent transmission.
Understanding transmission dynamics will contribute to knowledge on factors that enhance the spread of the disease, which is useful for developing preventive interventions. Molecular epidemiological studies have been very useful in a number of countries, identifying populations at risk and areas of high transmission, as well as improving understanding of the prevalence of different Mycobacterium tuberculosis complex (MTBC) strains with varied virulence and drug resistance rates (Anderson et al., 2014; Malm et al., 2017; Seto et al., 2017; Varghese et al., 2013; Walker et al., 2014; Yang et al., 2016). These studies have shown that the dynamics of TB transmission vary greatly geographically. Even though Africa harbors a large proportion of the global TB cases, with a current incidence of 254 per 100 000 population (WHO, 2017), population-based molecular epidemiological studies needed to understand transmission patterns are rare. The few studies conducted have not been population-based and have lacked an in-depth analysis of the transmission dynamics of MTBC strains belonging to different lineages (Glynn et al., 2010; Mulenga et al., 2010).
The molecular typing tools, spacer oligonucleotide typing (spoligotyping) and mycobacterial interspersed repetitive unit variable number tandem repeat (MIRU-VNTR) typing, have been used successfully for strain differentiation in TB transmission studies due to their combined high discriminatory power and reproducibility; furthermore, in combination with epidemiological data, they have been used for the detection of recent TB transmission and outbreaks (Anderson et al., 2014; Barnes and Cave, 2003; Maguire et al., 2002; Surie et al., 2017; Varghese et al., 2013). Currently, the high cost and expertise needed for whole genome sequencing and analysis have precluded its use in population-based studies, and considering capacity building in a low-resource setting like Ghana, spoligotyping and MIRU-VNTR typing remain good alternatives.
TB in humans is caused mainly by Mycobacterium tuberculosis sensu stricto (MTBss) and Mycobacterium africanum (MAF), which are further divided into seven lineages: MTBss lineages 1-4 and 7 (L1-L4 and L7) and MAF lineages 5 and 6 (L5 and L6) (Blouin et al., 2012; de Jong et al., 2010). While MTBss is distributed globally, MAF is restricted to West Africa, where it is responsible for up to 50% of TB cases (Gagneux and Small, 2007). Nevertheless, reports mainly from the Gambia, where L6 is prevalent, suggest that MAF is attenuated compared to MTBss and hence could be outcompeted by MTBss (de Jong et al., 2008, 2010; Kallenius et al., 1999). However, an 8-year study recently conducted in Ghana found the prevalence of MAF to be fairly constant at approximately 20%, indicating that MAF and MTBss may be transmitted equally (Yeboah-Manu et al., 2016). The objective of this study was to determine the transmission dynamics of TB caused by MTBss and MAF in Ghana.

Study design and population
This study was a population-based prospective study in which sputum samples were collected from consecutive clinically diagnosed pulmonary TB patients reporting to 12 selected health facilities within an urban setting (Accra Metropolitan Assembly (AMA)) and the rural setting of East Mamprusi District (MamE) (Supplementary material, Figure S1). The study was conducted from July 2012 to December 2015. A pulmonary TB case was defined as an individual with a case of TB that was confirmed both clinically and bacteriologically. Detailed demographic and epidemiological data were obtained from consented participants.

Mycobacterial isolation, species identification, and drug susceptibility testing
The sputum samples were decontaminated and cultured on Lowenstein-Jensen medium to obtain mycobacterial isolates. These isolates were confirmed as MTBC by detecting the MTBC-specific insertion sequence IS6110 using PCR (Yeboah-Manu et al., 2001). In vitro drug susceptibility to isoniazid and rifampicin was determined using the microplate Alamar Blue cell viability assay, as described elsewhere, and/or the GenoType MTBDRplus assay (Hain Lifescience), following the manufacturer's protocol (Barnard et al., 2008).

Figure 1. Pipeline for recruited participants and culture-positive TB cases included in the clustering analysis. *Category described as untypeable for MIRU-VNTR includes isolates with ≥2 MIRU loci unamplified (n = 164, 71.3%) and isolates with a double allele at ≥2 MIRU loci (n = 66, 28.7%). These isolates were described as suspected mixed infection or laboratory contamination and hence were excluded from further analysis. #Frequency was expressed as the total number of Mycobacterium tuberculosis complex (MTBC) isolates obtained.

Lineage and strain classification
Lineage and strain classification of the MTBC was achieved in a stepwise manner using large sequence polymorphism typing identifying regions of difference 4, 9, 12, 702, and 711 (de Jong et al., 2010;Gagneux and Small, 2007), single nucleotide polymorphism typing, spoligotyping (Kamerbeek et al., 1997), and MIRU-VNTR typing (Supply et al., 2006). For MIRU-VNTR typing, a customized set of 8 MIRU loci was first used, as described by Asante-Poku et al. (2014), and clustered cases were resolved by analyzing the remaining 7 loci of the standard MIRU-15 loci set (Supply et al., 2006). All assays were well controlled with PCR amplifications and pre-PCR procedures conducted in physically separated compartments to avoid laboratory cross-contamination. The presence of more than one allelic repeat number (multiple allele) for any given locus is suggestive of laboratory cross-contamination, multiple strain infection, or microevolution of a single strain. To prevent bias resulting from cross-contamination and multiple strain infection, isolates with multiple alleles at more than one MIRU locus (described as 'untypeable') were excluded from further analysis. Isolates with only one multiple allele at any given locus were, however, included due to the possibility of microevolution.
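The exclusion rule for multiple alleles can be sketched in a few lines of Python. This is only an illustration of the rule as stated above, not the authors' pipeline: the function names, the dictionary layout (locus name mapped to the list of repeat numbers observed), and the example profiles are all assumptions made for the sketch.

```python
from typing import Mapping, Sequence


def count_multi_allele_loci(profile: Mapping[str, Sequence[int]]) -> int:
    """Count loci at which more than one distinct repeat number was observed."""
    return sum(1 for alleles in profile.values() if len(set(alleles)) > 1)


def is_untypeable(profile: Mapping[str, Sequence[int]]) -> bool:
    """Apply the stated rule: multiple alleles at more than one locus -> excluded.

    An isolate with a multiple allele at exactly one locus is retained,
    because this may reflect microevolution of a single strain.
    """
    return count_multi_allele_loci(profile) > 1


# Hypothetical profiles (locus labels and repeat numbers are illustrative only).
one_double_allele = {"MIRU10": [3], "MIRU16": [2, 3], "MIRU40": [1]}
two_double_alleles = {"MIRU10": [3, 4], "MIRU16": [2, 3], "MIRU40": [1]}

print(is_untypeable(one_double_allele))   # retained for analysis
print(is_untypeable(two_double_alleles))  # excluded as untypeable
```

The sketch covers only the multiple-allele criterion; handling of loci that failed to amplify is not modeled here.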
The spoligotyping patterns and assigned shared type numbers obtained were defined according to the SITVITWEB database (http://www.pasteur-guadeloupe.fr:8081/SITVIT_ONLINE/), while sub-lineages were assigned based on the MIRU-VNTRplus database (http://www.miru-vntrplus.org) (Allix-Beguec et al., 2008). Strains with no lineage nomenclature data were further identified using the TB lineage database (Shabbeer et al., 2012) or otherwise regarded as orphan strains. A strain was defined as an MTBC isolate with a unique molecular signature, and thus a unique spoligotype pattern and/or a unique MIRU-VNTR allelic pattern for the number of investigated MIRU loci.

Clustering analysis and risk factor assessment
Clustering analysis was performed using the categorical parameter and the unweighted pair group method with arithmetic mean (UPGMA) coefficient from a constructed phylogenetic tree using the online MIRU-VNTR tool. Clustering analysis was based on the assumption that strains with the same DNA fingerprint may be epidemiologically linked and associated with recent TB transmission (Hall, 1996). A cluster was defined as two or more isolates (same strain) that share an indistinguishable spoligotype and 15-locus MIRU-VNTR allelic pattern, but allowing for missing allelic data at any one of the difficult-to-amplify MIRU loci (VNTR 2163, 3690, and 4156). The size of a cluster was also defined using the total number of isolates in the cluster, classified into categories of small (2 isolates), medium (3-5 isolates), large (6-20 isolates), and very large (>20 isolates).
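The cluster definition and size categories can be illustrated with a minimal Python sketch. It is a simplification under stated assumptions: isolates are grouped by exact identity of spoligotype and MIRU-VNTR pattern, so the allowance for one missing locus is not implemented, and the function names, tuple layout, and example isolates are hypothetical rather than taken from the study.

```python
from collections import defaultdict
from typing import Iterable, List, Sequence, Tuple

Isolate = Tuple[str, str, Sequence[int]]  # (isolate id, spoligotype, MIRU-VNTR pattern)


def find_clusters(isolates: Iterable[Isolate]) -> List[List[str]]:
    """Group isolates with identical spoligotype and MIRU-VNTR pattern.

    Simplification: exact matching only; missing data at a locus is not tolerated.
    Returns only groups of two or more isolates (clusters).
    """
    groups = defaultdict(list)
    for isolate_id, spoligotype, miru in isolates:
        groups[(spoligotype, tuple(miru))].append(isolate_id)
    return [ids for ids in groups.values() if len(ids) >= 2]


def size_category(cluster_size: int) -> str:
    """Map a cluster size to the categories defined in the text."""
    if cluster_size < 2:
        raise ValueError("a cluster needs at least two isolates")
    if cluster_size == 2:
        return "small"
    if cluster_size <= 5:
        return "medium"
    if cluster_size <= 20:
        return "large"
    return "very large"


# Hypothetical isolates with shortened, made-up patterns.
example = [
    ("A", "spol-1", (2, 3, 1)),
    ("B", "spol-1", (2, 3, 1)),
    ("C", "spol-1", (2, 3, 2)),  # differs at one MIRU locus -> unique
    ("D", "spol-2", (2, 3, 1)),  # differs in spoligotype -> unique
]
clusters = find_clusters(example)
print(clusters, [size_category(len(c)) for c in clusters])
```

In this toy input only isolates A and B share both fingerprints, so they form one small cluster while C and D remain unique.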
The recent transmission rate was estimated using the n − 1 formula (Glynn et al., 1999): $(n_c - c)/n$, where $n_c$ is the total number of clustered cases, $c$ is the number of clusters, and $n$ is the total number of cases in the sample.
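As a worked example of the n − 1 calculation, the short Python sketch below applies it to the counts reported in the Results (1227 clustered isolates in 276 clusters among 2309 isolates). The function name is an assumption made for illustration.

```python
def n_minus_one_rate(clustered_cases: int, clusters: int, total_cases: int) -> float:
    """Recent transmission rate by the n - 1 method: (nc - c) / n.

    One case per cluster is treated as the source case and is therefore
    not counted as recently transmitted.
    """
    if total_cases <= 0:
        raise ValueError("total_cases must be positive")
    return (clustered_cases - clusters) / total_cases


# Counts reported in the Results section of this study.
rate = n_minus_one_rate(clustered_cases=1227, clusters=276, total_cases=2309)
print(f"{rate:.1%}")  # 41.2%
```

The result, (1227 − 276)/2309 = 951/2309, reproduces the overall clustering rate of 41.2% reported for the study.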
Only one strain per participant was included in the analysis, and follow-up cases were excluded. The clustering analysis was stratified first by location and then by MTBC lineage. The spatial distribution and clustering among all of the observed Spoligo/MIRU strain types were studied by constructing a minimum spanning tree (MST) with Bionumerics software (Applied Maths, Sint-Martens-Latem, Belgium).

Data management and analysis
Both molecular and epidemiological data were analyzed. Epidemiological data retrieved from all participants with positive MTBC cultures were included in the analysis, while data from those with no growth, contaminated cultures, or isolated nontuberculous mycobacterial species were excluded. All statistical analyses were conducted using the Stata statistical package version 14.2 (Stata Corp., College Station, TX, USA). The association of specific lineages and/or sub-lineages of the MTBC with time and/or geographical location was explored using the Chi-square test and a logistic regression model. For the determination of independent predictive factors for recent TB transmission, a multivariate analysis (forward stepwise approach with a probability entry of 0.1) was conducted using a logistic regression model while estimating the odds ratios (OR). p-Values of <0.05 were considered significant.
The study is reported according to the Strengthening the Reporting of Molecular Epidemiology for Infectious Diseases (STROME-ID) guidelines (Field et al., 2016).

Characteristics of study participants
A total of 3303 sputum smear-positive pulmonary TB cases were recruited, 382 (11.6%) from the rural setting and 2921 (88.4%) from the urban setting; 2604 (78.8%) MTBC isolates were obtained from these cases (Supplementary material, Table S1). After excluding 13 Mycobacterium bovis isolates and isolates that were untypeable (described in the Methods section), 2309 of 2604 isolates (88.7%) were included for clustering analysis. The participants comprised 1631 (71%) males and 663 (29%) females (there was no record of sex for 15 participants) with a median age of 39 years (range 3-91 years) and 33 years (range 4-90 years), respectively (Figure 1; Supplementary material, Table S1). The male-to-female ratio observed was comparable to the national average of approximately 2:1.
Of the 2309 participants with MTBC genotyping results, 201 (8.7%) were from the rural setting and 2108 (91.3%) from the urban setting. Among this study cohort, 7.4% (184/2482) of participants were previously treated cases including relapse, which is similar to the national value of 7.0% (WHO, 2015). Seventy-one percent (1561/2208) presented with a sputum smear microscopy bacterial burden result of at least 2+ and 33% (544/1665) admitted having contact with at least one TB patient. In a multivariate logistic regression analysis, it was found that male patients were less likely to be infected with an L5 strain (adjusted OR 0.7, 95% confidence interval (CI) 0.5-0.9) and individuals living in villages were more likely to be infected with an L6 strain (OR 6.6, 95% CI 1.2-36.1) (Supplementary material, Table S2).
Of the 2309 isolates included for clustering analysis, 1227 (53.1%) isolates clustered in 276 different clusters with a mean cluster size of 4 (range 2-35) and 1082 (46.9%) unique isolates were identified, giving a total of at least 1358 different MTBC strains circulating within the study population (Table 2a). Using the n − 1 method, the overall clustering rate (reflecting the recent transmission rate) was 41.2% (Figure 3). There was no significant difference in the clustering rate between the Cameroon and Ghana sub-lineages (p = 0.57) (Figure 3). While no significant difference in the recent transmission rates was seen between members of MAF (L5 and L6, p = 0.118), it was found that L4 was transmitted significantly more (p < 0.001), with seven of its clusters having very large cluster sizes (>20 isolates per cluster) made up of the Ghana sub-lineage (four very large clusters) and Cameroon sub-lineage (three very large clusters) (Figure 3; Supplementary material, Figure S2). Notwithstanding the lower transmissibility of L5 and L6 compared to L4, four large clusters were also observed for each of these lineages. The urban and rural settings had estimated recent transmission rates of 41.7% and 9.0%, respectively.

Exploring the diversity and clustering within the MTBC lineages
Very large molecular clusters (clusters with >20 isolates; defined in the Methods section) were observed for L4, in addition to one strikingly large cluster belonging to the Beijing family of lineage 2 (Figure 4; Supplementary material, Figure S3). Generally, only a few multidrug-resistant MTBC strains were observed across all the major lineages (Supplementary material, Figures S4-S6). There was no single large cluster with all isolates being multidrug-resistant (Supplementary material, Figure S4). The spatial distributions of the isolates constituting each cluster stratified by study setting are shown in the Supplementary material, Figures S7-S9.

Molecular epidemiology and factors associated with clustering: logistic regression modeling
Risk factors associated with recent TB transmission were sought. A total of 675 individuals belonging to either large (6-20 isolates) or very large (>20 isolates) molecular clusters were identified, with a combined median cluster size of 14 (range 6-35). The majority of the individuals belonging to very large clusters were male, with a male-to-female ratio of approximately 3:1, significantly higher than the 2:1 ratio observed in the general TB patient population (p = 0.022). Three large clusters (cluster ID MSC4193, MSC5003.X, and MSC4107, with cluster sizes of 9, 7, and 7, respectively) involved only male subjects (Table 3).
Epidemiological investigations revealed both localized and dispersed recent transmission among the clustered cases, with evidence suggestive of household transmission in at least six large clusters (MSC4063.X, MSC2001, MSC4095, MSC4063.18, MSC4069.X, and MSC4104). Specifically, the same L4 strain (part of cluster MSC4069.X) was found among three individuals belonging to the same household, with the oldest person (age 49 years) reporting having contact with his son who had TB 4 months prior to his episode (suggestive of household transmission). The majority of the large clusters involved TB strains circulating over almost the entire study period (Supplementary material, Figure S10). Apart from three Ghana sub-lineage clusters (MSC4104, MSC4031, and MSC4095) and one L6 cluster (MSC6004), with respectively 60% (6/10), 42% (11/26), 38% (9/24), and 43% (3/7) of isolates showing resistance to rifampicin and/or isoniazid (Table 3), such high levels of drug resistance were not observed in the other large and very large clusters. Only 2% of the isolates belonging to large and very large clusters were multidrug-resistant TB strains, and this was significantly lower than that for small (2 isolates) and medium (3-5 isolates) clusters (4%) (p = 0.031).
For the determination of possible factors associated with recent TB transmission, a general logistic regression model including all MTBC lineages was first performed, using the event of belonging to a clustered case as the outcome variable and participant variables as possible predictors (Table 4). In a separate logistic regression model, risk factors associated with recent TB transmission were tested stratified independently by L4 and L5 (Table 5), excluding L6 due to the limited sample size. In the multivariable analysis for the general logistic regression model, it was found that harboring either an isoniazid- or rifampicin-resistant MTBC strain (adjusted OR 0.7, 95% CI 0.5-0.9) was associated with lower odds of belonging to a clustered case (Table 4). All other factors such as education status, occupation, income level, ethnicity, religion, and HIV status had no association with recent TB transmission.
Finally, using adjusted predictions, it was found that the probability of belonging to a clustered case decreased with age and increased with the number of TB contacts (Figure 5). In a separate logistic regression analysis, including age as a continuous variable with belonging to a clustered case as the outcome variable, it was found that each year increase in age was significantly associated with an approximately 1% (95% CI 0.13-2.00%) decrease in the odds of a TB patient being part of a recent transmission event (p = 0.007).
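To show how a per-year effect of this size accumulates, the Python sketch below compounds a per-year odds ratio over an age gap. It rests on two assumptions that are not stated in the text: the "approximately 1%" decrease is read as a per-year odds ratio of about 0.99, and age enters the model linearly on the log-odds scale, as is usual when a continuous covariate is included in a logistic regression. The function name is hypothetical.

```python
import math


def odds_ratio_over_years(or_per_year: float, years: float) -> float:
    """Compound a per-year odds ratio over an age difference.

    With age linear on the log-odds scale, log-odds differences add,
    so odds ratios multiply: OR(years) = OR(1 year) ** years.
    """
    return math.exp(years * math.log(or_per_year))


OR_PER_YEAR = 0.99  # assumption: "approximately 1% decrease in odds per year"

or_10_years = odds_ratio_over_years(OR_PER_YEAR, 10)
print(round(or_10_years, 3))  # 0.904
```

Under these assumptions, a patient 10 years older would have roughly 10% lower odds of belonging to a clustered case than a younger patient who is otherwise comparable.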

Discussion
The aims of this study were to conduct a population-based prospective molecular epidemiological study to analyze the transmission dynamics of MTBC strains circulating in Ghana and to identify risk factors associated with recent TB transmission.
A high MTBC isolate recovery rate of 78.8% was obtained, higher than that reported in similar studies (Hamblion et al., 2016; Mears et al., 2015), which strengthens the ability of the sample to support assessments of the TB transmission rate in Ghana. This study identified a high TB clustering (recent TB transmission) rate of 41.2%, which is quite alarming, with the urban and rural areas having estimated rates of 41.7% and 9.0%, respectively (Table 2b). These findings call for intensifying community outreach programs to encourage early case reporting and infection control. Moreover, the analysis predicted the probability of clustering to generally increase with the increase in the number of TB contacts (Figure 5). This means that a susceptible individual becomes more likely to have TB and be involved in a recent transmission event as the number of TB contacts increases. Within the study population, no association of recent TB transmission was found with education status, occupation, income level, ethnicity, religion, or HIV status. However, it was observed that age below 30 years was associated with recent TB transmission, and this is similar to observations made elsewhere (Hamblion et al., 2016; Vluggen et al., 2017). Also in this study, it was observed that each year increase in age was associated with an approximately 1% (95% CI 0.13-2.00%; p = 0.007) decrease in the odds of a TB patient being part of a recent transmission event, implying that compared to younger individuals, older individuals are more likely to get active TB disease by reactivation of latent TB infection rather than through a recent transmission event (Hamblion et al., 2016). This finding identifies age as a risk factor for recent TB transmission in Ghana. However, this finding was largely driven by L4 and L5, since separate analysis was not valid for L6 due to the small sample size.
Furthermore, it was found that the male-to-female ratio among very large clusters was significantly higher than that observed in the general TB patient population (p = 0.022). This finding, together with the observation that some large clusters involved only male subjects, also indicates that males have a higher risk of recent TB transmission compared to females, suggesting that males may engage in certain social activities that predispose them to belonging to a recent transmission event.

Table notes: MTBC, Mycobacterium tuberculosis complex; TB, tuberculosis; OR, odds ratio; CI, confidence interval; JHS, junior high school; GH¢, Ghanaian cedi. a For the multivariate model, only variables with p < 0.1 and with at least 90% of available data were included. However, 'locality' was excluded due to the small sample size from the rural setting. Residence classification, marital status, isoniazid mono-resistance, and MDR were excluded due to collinearity with other variables in the model. b A cluster was defined as two or more isolates (same strain) that share an indistinguishable spoligotype and 15-locus MIRU-VNTR allelic pattern, but allowing for one missing allelic data at any one of the difficult-to-amplify MIRU loci. c A significant decreasing trend in the probability of belonging to a clustered case was found with increasing age category (p = 0.004).
A lower rate of multidrug-resistant TB was seen among cases in large and very large clusters compared to those in small and medium clusters (2% vs. 4%, p = 0.031), indicating a low multidrug-resistant TB transmissibility within the study population. This finding further suggests that the majority of drug-resistant TB cases in Ghana acquired the drug resistance during treatment, which indicates poor patient compliance (Danso et al., 2015). Moreover, it was also found that, compared to drug (isoniazid and/or rifampicin)-sensitive MTBC strains, MTBC strains with isoniazid and/or rifampicin resistance were less likely to be involved in a recent transmission event (adjusted OR 0.7, 95% CI 0.5-0.9). Within the study setting, a reduced transmission of MAF (L5: 31.8%, L6: 24.7%) compared to MTBss L4 (44.9%) was observed. The high recent transmission rate observed for L4 was driven by both the Cameroon and Ghana sub-lineages, with no difference in their transmissibility, hence identifying these sub-lineages as very important pathogens. The high recent transmission of the Ghana sub-lineage, coupled with its recently reported association with drug resistance, is of public health importance and hence calls for the national tuberculosis control program to support peripheral diagnostic laboratories with facilities to accurately detect and help control the spread of the Ghana sub-lineage.

Table notes: Table 4 were included in this analysis. *p < 0.05; **p < 0.001. b For the multivariate model, only variables with p < 0.1 and with at least 90% of available data were included. c A cluster was defined as two or more isolates (same strain) that share an indistinguishable spoligotype and 15-locus MIRU-VNTR allelic pattern, but allowing for one missing allelic data at any one of the difficult-to-amplify MIRU loci.
The higher recent transmission rate for L4 compared to L5 and L6 may not necessarily imply the outcompeting of L5 and L6 by L4, as their relative proportions remained constant over the entire study period (Figure 2), which is consistent with previous reports (Yeboah-Manu et al., 2016). Despite the low transmissibility of MAF, the observed stable relative proportion over the entire study period may be because the pathogen has adapted to infecting specific host populations (possibly due to unidentified host genetic or environmental factors peculiar to some West African inhabitants), hence enabling the maintenance of a stable prevalence over time. Using adjusted predictions for the probability of clustering, it was found that MAF L5 may still have the propensity to transmit equally to L4 (Figure 5), not forgetting the confounding effect of a higher diversity in spoligotype pattern of L5 compared to L4 and hence reduced clustering of the former. Compared to L4, a significant association of L6 with individuals living in villages was found (OR 6.6, p < 0.05; Supplementary material, Table S2). The low recent TB transmission in the villages, coupled with the association of L6 with village residence, could be the reason why low frequencies of L6 strains were observed within the study setting.
This report could be limited by an underestimation of the recent transmission rate resulting from the misclassification of strains as unique if they were actually clustered with cases outside of the restricted geographic sampling site and sampling period. However, measures were taken to address the underestimation of recent TB transmission by recruiting up to 90% of the diagnosed TB cases spanning a 3.5-year period. In addition, an overestimation of recent TB transmission rates is also possible, considering that the clustering analysis was based on combined 15-locus MIRU-VNTR typing and spoligotyping, whereas whole genome sequencing could have offered a better resolution of strains.
Overall, the findings indicate high recent TB transmission, suggesting the occurrence of unsuspected outbreaks. The intensification of community education is recommended to improve early case reporting and infection control.

Funding
This research was funded by a Wellcome Trust Intermediate Fellowship Grant (097134/Z/11/Z) to Dorothy Yeboah-Manu. The funding source had no role in the study design, collection, analysis, and interpretation of the data, in the writing of the report, or in the decision to submit the paper for publication.

Ethical approval
The Scientific and Technical Committee and then the Institutional Review Board at NMIMR, University of Ghana (FWA00001824) reviewed and approved the study.