Reduced transmission of Mycobacterium africanum compared to M. tuberculosis in urban West Africa

Background Understanding transmission dynamics is useful for tuberculosis (TB) control. We conducted a population-based molecular epidemiological study to understand TB transmission in Ghana. Methods Mycobacterium tuberculosis complex (MTBC) isolates obtained from prospectively sampled pulmonary TB patients between July, 2012 and December, 2015 were confirmed as MTBC using IS6110 PCR. MTBC lineages were identified by large sequence polymorphism and single nucleotide polymorphism assays and further characterized using spoligotyping and standard 15-loci MIRU-VNTR typing. We used the n-1 method to estimate recent TB transmission and identified associated risk factors using logistic regression analysis. Findings Of 2,309 MTBC isolates, 1,082 (46·9%) were single cases and 1,227 (53·1%) belonged to one of 276 clusters (cluster size range: 2-35). The recent TB transmission rate was estimated to be 41·2%. We saw no significant difference in recent transmission rates between the two lineages of Mycobacterium africanum (lineage 5, 31·8%; lineage 6, 24·7%; p=0·118), but lineage 4 of M. tuberculosis had a significantly higher recent transmission rate (44·9%, p<0·001). Finally, apart from age being significantly associated with recent TB transmission (p=0·007), we identified a significant departure in the male/female ratio among very large clusters compared to the general TB patient population (3:1 vs. 2:1, p=0·022). Interpretations Our findings indicate high recent TB transmission, suggesting that unsuspected outbreaks occur. The observed reduced transmission rate of M. africanum suggests that other factor(s) may be responsible for its continuous presence in West Africa. Funding Wellcome Trust Intermediate Fellowship Grant 097134/Z/11/Z to Dorothy Yeboah-Manu.

(CI: 0·13 -2·00, p=0·007) decrease in the odds of a TB patient being part of a recent transmission event, putting age as a significant risk factor. This implies that, compared to younger individuals, older individuals are more likely to get active TB disease by reactivation of latent TB infection. Also, we identified that the male/female ratio among very large clusters (cluster size; n>20) was significantly higher than that observed in the general TB patient population (3:1 vs. 2:1, p=0·022), with some large clusters (cluster size between 6 and 20) involving only males. Using adjusted predictions, we found that TB patients were more likely to be involved in a recent […]

The molecular typing tools, spacer oligonucleotide typing (spoligotyping) and mycobacterial interspersed repetitive unit-variable number of tandem repeat (MIRU-VNTR) typing, have been successfully used for strain differentiation in TB transmission studies due to their combined high discriminatory power and reproducibility and, in combination with epidemiological data, have been widely used for the detection of recent TB transmission/outbreaks [2,4,9,10].

Lineage and strain classification of the MTBC was achieved in a stepwise manner using large sequence polymorphism typing identifying regions of difference 4, 9, 12, 702, and 711 [11,13], single nucleotide polymorphism typing, spoligotyping [20], and MIRU-VNTR typing [21]. For MIRU-VNTR typing, we first used a customized set of 8 MIRU loci as described by Asante-Poku et al. [22] and resolved clustered cases by analyzing the remaining 7 loci of the standard MIRU-15 loci set [21]. All assays were well controlled, with PCR amplifications and pre-PCR procedures conducted in physically separated compartments to avoid laboratory cross contamination.

The presence of more than one allelic repeat number (multiple alleles) for any given locus is suggestive of laboratory cross contamination, multiple strain infection or microevolution of a single strain. To prevent bias resulting from cross contamination and multiple strain infection, isolates with multiple alleles at more than one MIRU locus (described as untypeable) were excluded from further analysis. Isolates with multiple alleles at only one locus were, however, included because of the possibility of microevolution.
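The exclusion rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary layout, the function names and the locus labels are hypothetical and are not taken from the study's own pipeline.

```python
from typing import Dict, List


def count_multi_allele_loci(profile: Dict[str, List[int]]) -> int:
    """Count loci at which more than one distinct repeat number was observed.

    `profile` maps a (hypothetical) locus label to the list of repeat
    numbers read at that locus for a single isolate.
    """
    return sum(1 for alleles in profile.values() if len(set(alleles)) > 1)


def keep_for_analysis(profile: Dict[str, List[int]]) -> bool:
    """Apply the rule described in the text.

    Multiple alleles at more than one locus -> excluded (untypeable).
    Multiple alleles at exactly one locus   -> kept (possible microevolution).
    """
    return count_multi_allele_loci(profile) <= 1


# Hypothetical isolates with three illustrative loci each.
clean = {"locus_a": [2], "locus_b": [3], "locus_c": [5]}
one_double = {"locus_a": [2, 3], "locus_b": [3], "locus_c": [5]}
two_doubles = {"locus_a": [2, 3], "locus_b": [3, 4], "locus_c": [5]}

for name, profile in [("clean", clean), ("one_double", one_double), ("two_doubles", two_doubles)]:
    print(name, keep_for_analysis(profile))
```

Only the multiple-allele criterion is modelled here; any additional exclusion criteria would need their own checks.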

The spoligotyping patterns and assigned shared type numbers obtained were defined according to […]

Clustering analysis was performed using the categorical parameter and the UPGMA coefficient from a constructed phylogenetic tree using the online MIRU-VNTR tool. Clustering analysis was based on the assumption that strains with the same DNA fingerprint may be epidemiologically linked and associated with recent TB transmission [25]. A cluster was defined as two or more isolates (same strain) that share an indistinguishable spoligotype and 15-loci MIRU-VNTR allelic pattern, allowing for missing allelic data at any one of the difficult-to-amplify MIRU loci (VNTR 2163, 3690 and 4156). We also classified clusters by the total number of isolates they contain as small (2 isolates), medium (3-5 isolates), large (6-20 isolates) and very large (>20 isolates).
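As a rough illustration of the cluster definition and size categories above, the following Python sketch groups isolates by identical fingerprints and labels each group. It is a simplification under stated assumptions: matching is exact, so the allowance for one missing allele at a difficult-to-amplify locus is deliberately not implemented, and the record layout, function names and example values are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# (isolate id, spoligotype pattern, MIRU-VNTR repeat numbers); layout is hypothetical.
Isolate = Tuple[str, str, Tuple[int, ...]]


def size_category(n_isolates: int) -> str:
    """Map a group size to the categories used in the text."""
    if n_isolates < 2:
        return "singleton"
    if n_isolates == 2:
        return "small"
    if n_isolates <= 5:
        return "medium"
    if n_isolates <= 20:
        return "large"
    return "very large"


def find_clusters(isolates: Iterable[Isolate]) -> Dict[tuple, List[str]]:
    """Group isolates sharing an identical spoligotype and MIRU-VNTR pattern.

    Simplification: exact matching only; missing alleles are not tolerated.
    """
    groups: Dict[tuple, List[str]] = defaultdict(list)
    for isolate_id, spoligotype, miru in isolates:
        groups[(spoligotype, tuple(miru))].append(isolate_id)
    # A cluster needs at least two isolates; the rest are singletons.
    return {key: ids for key, ids in groups.items() if len(ids) >= 2}


# Hypothetical toy data with shortened patterns.
toy = [
    ("iso1", "pattern_x", (2, 3, 5)),
    ("iso2", "pattern_x", (2, 3, 5)),
    ("iso3", "pattern_x", (2, 4, 5)),
    ("iso4", "pattern_y", (1, 1, 1)),
]
clusters = find_clusters(toy)
for key, ids in clusters.items():
    print(ids, size_category(len(ids)))
```

Tolerating missing data makes matching non-transitive, which is one reason tree-based tools are typically used for the real analysis.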

The recent transmission rate was estimated using the n-1 formula [26]:

$$\text{Recent transmission rate} = \frac{n_c - c}{n}$$

where $n_c$ is the total number of clustered cases, $c$ is the number of clusters, and $n$ is the total number of cases in the sample.

We included only one strain per participant in our analysis and excluded follow-up cases.
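As a worked check of the n-1 estimate, the short Python sketch below assumes the usual form of the method, (nc - c)/n, and plugs in the counts reported in the abstract (1,227 clustered isolates, 276 clusters, 2,309 isolates in total). The function name is illustrative only.

```python
def recent_transmission_rate(n_clustered: int, n_clusters: int, n_total: int) -> float:
    """n-1 method: one case per cluster is treated as the source case,
    so only (n_clustered - n_clusters) cases count as recent transmission."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    if n_clusters > n_clustered or n_clustered > n_total:
        raise ValueError("expected n_clusters <= n_clustered <= n_total")
    return (n_clustered - n_clusters) / n_total


# Counts taken from the abstract of this study.
rate = recent_transmission_rate(n_clustered=1227, n_clusters=276, n_total=2309)
print(f"{rate:.1%}")  # 41.2%
```

The result, (1,227 - 276)/2,309 = 951/2,309, agrees with the reported overall rate of 41·2%.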

Characteristics of study participants

We recruited 3,303 sputum smear positive pulmonary TB cases and obtained 2,604 (78·8%) MTBC isolates. After excluding 13 M. bovis and isolates that were untypeable (described in methods), 2,309/2,604 (88·7%) isolates were included for clustering analysis. The participants included comprised 71% (1,631) males and 29% (663) females. […] were previously treated cases, which is similar to the national value of 7·2% [1]. Seventy-one percent (1,561/2,208) presented with a sputum smear microscopy bacterial burden of at least 2+, and 33% (544/1,665) admitted having contact with at least one TB patient. In a multivariate logistic regression analysis, we found that male patients are less likely to be infected with an L5 strain (adjusted OR 0·7, 95% CI 0·5 -0·9) and individuals living in villages are more likely to be infected with an L6 strain (OR 6·6, CI 1·2 -36·1) (appendix p 3).

Of the 2,309 isolates included for clustering analysis, we identified 1,227 (53·1%) isolates clustered in 276 different clusters, with an average cluster size of 4 (range: 2 -35), and 1,082 (46·9%) singletons, giving a total of at least 1,358 unique MTBC strains circulating within our study population (table 2a). Using the n-1 method, we estimated the overall clustering rate (reflecting recent transmission rate) to be 41·2%. Lineages 2, 4 and 5 contributed high clustering rates of 53·8%, 44·9%, and 31·8%, respectively (table 2a) […]

Exploring the diversity and clustering within the MTBC lineages

We included data from both 15-MIRU (allelic information) and spoligotyping pattern to […]

Next we looked for risk factors that may be associated with recent TB transmission. We identified a total of 675 individuals belonging to either large (6 -20 isolates) or very large (>20 isolates) molecular clusters, with a combined median cluster size of 14 (range 6 -35). The majority of the individuals belonging to very large clusters were males, with a male to female ratio of approximately 3:1, significantly higher than the 2:1 ratio observed in the general TB patient population (p=0·022). Three large clusters (cluster IDs MSC4193, MSC5003·X and MSC4107, with cluster sizes of 9, 7 and 7, respectively) involved only males (table 3).

We included in this analysis only variables with p<0·1 from the general logistic regression model in table 4.

For the multivariate model, we included only variables with p<0·1 and with at least 90% of available data.

*p<0·05; **p<0·001

# A cluster was defined as two or more isolates (same strain) that share an indistinguishable spoligotype and 15-loci MIRU-VNTR allelic pattern but allowing for one missing allelic data at any one of the difficult-to-amplify MIRU loci.

The aims of this study were to conduct a population-based prospective molecular epidemiological study to analyze the transmission dynamics of MTBC strains circulating in Ghana and to identify risk factors associated with recent TB transmission. The current study also offered us the opportunity to create a local repository of strain types for future reference to help monitor TB programs by analyzing the trends in estimates of recent transmission.

We obtained a high MTBC isolate recovery rate of 78·8%, higher than that reported in similar studies [28,29], and this strengthens the power of the sample size to make assessments of TB […]

*Category described as untypeable for MIRU-VNTR includes isolates with ≥ 2 MIRU loci un-amplified (164, 71·3%) and isolates with double alleles at ≥ 2 MIRU loci (66, 28·7%). These isolates were described as suspected mixed infection or laboratory contamination and hence were removed from further analysis.

# Frequency was expressed as the total number of Mycobacterium tuberculosis complex (MTBC) isolates obtained