Pooling of sputum samples to increase tuberculosis diagnostic capacity in Brazil during the COVID-19 pandemic

Objectives: We assessed whether combining (pooling) four individuals' samples and testing the pool with Xpert Ultra, as a more efficient testing method, has the same accuracy as testing samples individually.
Methods: We conducted a cross-sectional study of individuals with presumptive tuberculosis attending primary health care or general hospital facilities in Alagoas, Brazil. The sputum samples of four consecutive individuals were pooled, and the pool and individual samples were tested with Xpert Ultra. The agreement between the tests was assessed using kappa statistics. We estimated the sensitivity and specificity of pooling, using the individual test as the reference standard, and the potential cartridge savings.
Results: A total of 396 participants were tested. A total of 95 (24.0%) individual samples were Mycobacterium tuberculosis (MTB)-positive, 300 (75.8%) were “MTB not detected”, including 20 “MTB trace”, and one reported an error. A total of 99 pools of four samples were tested, of which 62 (62.6%) had MTB detected and 37 (37.4%) MTB not detected, including six (6.1%) with MTB trace. The agreement between individual and pooled testing was 96.0%. Pooling had a sensitivity of 95.0% (95% confidence interval 86.9-99%), specificity of 97.1% (95% confidence interval 85.1-99.9%), and kappa of 0.913. The method saved 12.4% of cartridge costs.
Conclusion: The pooled testing of specimens had a high level of agreement with individual testing. The pooling of samples for testing improves the efficiency of testing, potentially enabling the screening and testing of larger numbers of individuals more cost-effectively.


Introduction
Tuberculosis (TB) is second only to COVID-19 as a cause of adult death due to infection. Although TB is ubiquitous, its distribution is not even, and 30 countries, including Brazil, account for 86-90% of the global TB incidence [1]. Under-reporting of TB is a major 14.3% lower than in 2019, which was accompanied by a 14% reduction in the use of rapid molecular tests [3].
Access to TB treatment depends on good quality diagnosis and the World Health Organization (WHO) recommends using nucleic acid amplification (NAA) assays as the initial tests for diagnosis. Although these tests are sensitive and specific, with Xpert MTB/RIF and Xpert Ultra being the most frequently used assays [4], only a fraction of individuals are tested with these assays because their implementation is limited by the laboratory infrastructure required and the cost of the cartridges.
Xpert assays were approved in Brazil in 2013 and are indicated as the first tests for diagnosis of TB within the Brazilian National Health System. Cartridges are provided by the Ministry of Health, and laboratory stocks are based on the estimated population they serve. Clinical specimens collected by the clinics are transported using sample transport networks and testing is centralized in reference laboratories. Health services, however, became severely strained during the pandemic, and although there have been no reports of cartridge shortages, the workload of TB and SARS-CoV-2 tests is high and often requires extended testing over the weekend.
Recent studies have reported that the sputum pooling method, in which samples from several patients are combined and tested together, could increase the efficiency of NAA TB assays [5, 6]. However, there are no data on the performance of this method among individuals with presumptive TB attending primary health care in Brazil.
Therefore, we evaluated whether combining specimens of four individuals with presumptive TB and testing the pool with Xpert Ultra would result in the same accuracy as testing samples individually and estimated whether the pooling approach would result in cost savings.

Methods
This was a cross-sectional survey of consecutive individuals attending primary health care units or general hospitals with signs and symptoms of presumptive TB in Alagoas state, Brazil from September 2021 to February 2022. Adults with presumptive TB were requested to produce two samples of expectorated sputum for examination, following the Brazilian routine procedures for TB diagnostic centers. Sputum samples were kept refrigerated and transported to the testing laboratories daily, or batched in consignments and submitted every 2-3 days, depending on the local availability of transport. Samples were transported using a cold chain with cold boxes until tested. Sputum samples were routinely processed in the laboratory and tested using Xpert Ultra, following the manufacturer's instructions with a 1:2 sputum to reagent ratio. Samples with at least 0.5 ml of leftover sputum were selected for the pooling study. We combined (or pooled) the sputum of four consecutive individuals into one pot and tested the pool with a single Xpert Ultra test. Individual sample results were used for the purpose of evaluating the performance of the approach and for modeling potential savings. Individual and pooled Xpert tests reporting invalid, error, and no result were repeated, if there was enough sample left for testing, and the repeated test result was included in the analysis. Samples with trace call results on individual tests were retested if sufficient sample was available and are described with all results to support interpretation.

Statistical analysis
We conducted the pooling assessment during the period of the COVID-19 epidemic. The sample size for the survey was not formally estimated because we were limited by the expected number of participants attending the services and the capacity of the staff to conduct the testing in addition to their routine activities. All data were stored in anonymized databases compliant with data protection legislation. Categorical data were summarized using descriptive statistics with 95% confidence intervals (CIs). Chi-square tests and chi-square tests for trend were used to test for statistically significant differences. Individuals unable to produce sufficient sputum were excluded from the study.
The pooled and individual tests were compared, and their agreement was tested using kappa statistics. We considered the results concordant if (i) the pool result was negative and all tests for the four individual samples were negative, or (ii) the pool result was positive and at least one of the tests for the individual samples was positive. The kappa values and their interpretations were as follows: <0, no agreement; 0-0.19, very weak agreement; 0.20-0.39, weak agreement; 0.40-0.59, moderate agreement; 0.60-0.79, substantial agreement; and 0.80-1.00, excellent agreement [7]. The Mycobacterium tuberculosis (MTB) grades (trace, very low, low, medium, and high) of individual and pooled tests were compared to describe the effect of combining the samples. Samples with trace results were retested if there was sufficient sample left but were considered to be test-negative because WHO recommends not to retest them and not to consider them as positive, unless considered with additional clinical findings and medical history [8]. We present trace results as a separate category for clarity; they were considered negative for the purpose of the test agreement. The sensitivity and specificity of pooled testing were estimated using the single Xpert Ultra test on an individual sputum sample as the reference standard. The cost differences were calculated on the basis of the number of cartridges required to test all specimens using pooled and individual testing, assuming a cartridge procurement cost of $9.98 [9].
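The concordance rule and the handling of trace results described above can be expressed as a short check. This is a minimal sketch under stated assumptions, not the analysis code used in the study: the function names and the result labels ("detected", "trace", "not_detected") are hypothetical.

```python
from typing import Sequence

# Hypothetical result labels; the study's own data coding is not given in the text.
# "trace" is deliberately left out because trace results were analysed as negative.
POSITIVE_LABELS = {"detected"}


def is_positive(result: str) -> bool:
    """Return True only for an 'MTB detected' result; trace counts as negative."""
    return result in POSITIVE_LABELS


def pool_is_concordant(pool_result: str, individual_results: Sequence[str]) -> bool:
    """Apply the concordance rule to one pool.

    Concordant if the pool is negative and every individual sample is negative,
    or the pool is positive and at least one individual sample is positive.
    """
    any_individual_positive = any(is_positive(r) for r in individual_results)
    return is_positive(pool_result) == any_individual_positive


# A negative pool whose members are negative or trace is concordant.
print(pool_is_concordant("not_detected", ["not_detected", "trace", "not_detected", "not_detected"]))
# A negative pool containing one positive sample is discordant (a false negative).
print(pool_is_concordant("not_detected", ["detected", "not_detected", "not_detected", "not_detected"]))
```

Under this rule a discordant pool is either a false negative (negative pool, at least one positive member) or a false positive (positive pool, no positive member).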

Ethical approval
The study was approved by the Committee on Ethics in Research with Human Beings at the Universidade Federal de Alagoas, Brazil (CAAE number 45432821.2.0000.5013) and the Liverpool School of Tropical Medicine Research Ethics Committee, UK (Ethical waiver 20-037). A written informed consent waiver was obtained from the participants.

Role of the funding source
The study sponsors had no role in study design, data collection, data analysis, interpretation, writing of the report, or in the decision to submit the paper for publication.

Results
A total of 396 participants with a mean (SD) age of 49 (16.9) years were included. Of these, 252 (63.6%) were male and 144 (36.4%) were female. The largest proportion of participants (152, 38.4%) were aged ≥55 years and the smallest proportion (89, 22.5%) were aged <35 years, as shown in Table 1. The samples were considered of good quality for most participants; only 15 (3.8%) contained saliva and 11 (2.8%) contained blood traces. A total of 28 individual samples had "MTB trace" results and 11 had sufficient volumes for retesting. Three of the retested samples were "MTB detected" (one each with very low, low, and medium MTB grades), five were MTB-negative, and three were, again, reported as MTB trace. All 20 samples reported with trace results (those not retested and retested samples reporting a repeat trace result) were considered "MTB not detected" for the analysis. A total of 15 samples reported errors, with four reporting MTB detected on retesting, 10 MTB not detected, and one a repeated error. A further six samples
Individual samples were tested in 99 pools of four. A total of 62 (62.6%) pools had MTB detected and 37 (37.4%) MTB not detected, with the latter including six (6.1%) pools with MTB trace. One pool was rifampicin-positive (Table 2). The individual and pooled tests were in agreement, except for one pool containing four individual MTB-negative samples, which tested MTB-positive, and three pools containing one MTB-positive sample on individual testing, which tested MTB-negative. All 26 (100%) pools containing two or more MTB-positive samples tested positive (Table 3). The overall agreement was 96%, with a sensitivity of 95% (95% CI 86.9-99%), specificity of 97.1% (95% CI 85.1-99.9%), and kappa of 0.913 ("near-perfect agreement") (Table 4). The agreement of individual and pooled tests was associated with the MTB grade. A total of 38 pools contained only one MTB-positive sample. The pool Xpert grade was lower than the individual grade in 13 pools, similar to the individual grade in 20, and higher in five, as shown in Table 5.
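The agreement statistics can be recomputed from a 2×2 table of pool results against the individual-test reference. The sketch below is illustrative only: the cell counts (61 true-positive, 1 false-positive, 3 false-negative, and 34 true-negative pools) are reconstructed from the counts reported in the text (62 positive pools, one of them with no positive sample, and three negative pools containing a positive sample among 99 pools), not copied from Table 4, and the function name is hypothetical.

```python
def pooled_accuracy(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Agreement, sensitivity, specificity and Cohen's kappa from a 2x2 table.

    Rows are pool results, columns are the reference (any individual positive).
    """
    n = tp + fp + fn + tn
    observed = (tp + tn) / n  # overall agreement
    # Chance agreement from the marginal totals of the two classifications.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2
    return {
        "agreement": observed,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "kappa": (observed - expected) / (1 - expected),
    }


# Counts reconstructed from the text (an assumption, see the note above).
stats = pooled_accuracy(tp=61, fp=1, fn=3, tn=34)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

With these reconstructed counts the agreement, specificity, and kappa round to the reported 96.0%, 97.1%, and 0.913, and the sensitivity is 61/64 (95.3%), in line with the reported 95%. The exact binomial confidence intervals are not reproduced here.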
Testing the samples individually required 396 Xpert Ultra cartridges at a cost at source of $3952.08. Testing the samples using the pooling method required 99 cartridges ($988.02) to test the pools and 248 cartridges ($2475.04) to retest the individual samples of the 62 positive pools (total cost $3463.06), resulting in $489.02 (12.4%) savings in cartridge costs.
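The cartridge arithmetic above follows directly from the number of pools and the number of positive pools that trigger individual retesting. The following sketch reproduces it; the function and variable names are hypothetical, and it assumes that every sample in a positive pool is retested individually and that the number of samples is a multiple of the pool size.

```python
from decimal import Decimal

CARTRIDGE_COST = Decimal("9.98")  # procurement cost per cartridge assumed in the study


def cartridge_costs(n_samples: int, pool_size: int, positive_pools: int) -> dict:
    """Compare cartridge costs of individual testing and two-stage pooled testing."""
    n_pools = n_samples // pool_size  # assumes n_samples is a multiple of pool_size
    retests = positive_pools * pool_size  # every member of a positive pool is retested
    individual_cost = n_samples * CARTRIDGE_COST
    pooled_cost = (n_pools + retests) * CARTRIDGE_COST
    savings = individual_cost - pooled_cost
    return {
        "individual_cost": individual_cost,
        "pooled_cartridges": n_pools + retests,
        "pooled_cost": pooled_cost,
        "savings": savings,
        "savings_fraction": savings / individual_cost,
    }


costs = cartridge_costs(n_samples=396, pool_size=4, positive_pools=62)
print(costs["individual_cost"], costs["pooled_cost"], costs["savings"])
print(f"{float(costs['savings_fraction']):.1%}")
```

`Decimal` is used instead of floating point so that the dollar amounts match the reported figures to the cent.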

Discussion
This is the first evaluation of the pooling method for the diagnosis of TB in Brazil, a high burden country where NAA assays are used as the first test for diagnosis. Participants in the study reflect the characteristics of individuals with presumptive TB in the country, with a higher proportion being male and a higher proportion of males having positive tests than females. Our findings confirm that samples tested with Xpert Ultra using the pooling method have a high level of agreement with individual testing, as previously reported in Cambodia [10] and Laos [6]. Most disagreements were false negatives (three pools, 3%), which occurred in samples with low MTB grades, and only one pool containing negative samples tested positive. Although false negatives are mostly due to low bacilli loads and the limitations of the tests to detect paucibacillary TB, some studies have reported false positive results during pooled testing [11, 12]. Although false positives are usually attributed to cross-contamination resulting from the additional manipulation of samples, pooled testing for other pathogens (e.g., SARS-CoV-2) has been reported to result in reduced cycle threshold values of the pools. This effect may reflect gains in polymerase chain reaction (PCR) efficiency, either through a "carrier RNA" effect caused by the increased total cellular RNA in the pool or through the dilution of PCR inhibitors by the pooling process [13, 14]. False positive pooled results will lower potential savings but would not affect final clinical decisions because the samples in the pool would be tested individually.
As expected, some individual samples (n = 28, 7%) had trace results. Trace results are known to have low repeatability and WHO recommends not to retest these samples. This recommendation is based on the difficulties in interpreting the repeat results in patients with a history of disease and a high false-positivity rate [8]. A small proportion of samples (n = 11) were retested to try to obtain a positive or negative result; of these, only three were reported as MTB-positive, five as MTB-negative, and three gave a second trace result. Thus, our findings are in agreement with WHO guidelines and with studies in high burden countries, where it is expected that a variable but low proportion of trace results are confirmed culture-positive and patients should undertake further tests and examinations [15].
Our study has some limitations that need to be considered for the interpretation of the results. We used the individual Xpert Ultra test as the reference standard. This is not a perfect reference standard because culture, which has a higher sensitivity, is the accepted reference standard; however, sputum culture was not available in the study setting, and most recent publications have used a similar approach. Essentially, the comparison of pooling with single testing describes the sensitivity of pooling against single testing. The efficiency of the pooling method depends on the proportion of samples that are positive. If the proportion positive is low, few of the pools need to be retested, whereas if most pools are positive, retesting a high proportion of the pools negates its potential advantages. We expected that only 10-15% of the samples would test positive, reflecting prepandemic testing patterns. However, samples collected during the epidemic, when access to health services was limited due to movement restrictions and laboratories were under severe strain, resulted in a high proportion of samples testing positive. This unusually high proportion limited the number of cartridges that could be saved. Although savings amounted to 12%, this is lower than the 30-50% savings reported from other settings, and smaller pools of two or three samples could have resulted in higher savings. In addition, we only analyzed the agreement of Xpert tests. This approach would miss culture-positive, Xpert-negative samples with low bacilli concentrations. Because the agreement is dependent on the sensitivity of the test, the agreement will likely vary with the proportion of paucibacillary samples in the study population, which may explain why the studies in Laos [6] reported a higher level of agreement.
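The dependence of pooling efficiency on the proportion of positive samples can be illustrated with the classical two-stage (Dorfman) pooling model. This is a simplified sketch and not an analysis performed in the study: it assumes that samples are positive independently of each other, that the pooled test detects every pool containing a positive sample and never flags an all-negative pool, and that every member of a positive pool is retested. The function names are hypothetical.

```python
def expected_cartridges_per_sample(positivity: float, pool_size: int) -> float:
    """Expected cartridges used per sample under idealized two-stage pooling."""
    if pool_size == 1:
        return 1.0  # no pooling: one cartridge per sample
    # Probability that a pool contains at least one positive sample.
    pool_positive = 1 - (1 - positivity) ** pool_size
    # One pooled test shared by pool_size samples, plus one retest per sample
    # whenever the pool is positive.
    return 1 / pool_size + pool_positive


def best_pool_size(positivity: float, candidates=(1, 2, 3, 4)) -> int:
    """Pool size with the fewest expected cartridges per sample."""
    return min(candidates, key=lambda k: expected_cartridges_per_sample(positivity, k))


for p in (0.10, 0.15, 0.24):
    per_sample = {k: round(expected_cartridges_per_sample(p, k), 3) for k in (2, 3, 4)}
    print(f"positivity {p:.0%}: {per_sample}, best pool size {best_pool_size(p)}")
```

Under these assumptions, a 24% positivity with pools of four gives about 0.92 cartridges per sample, whereas pools of three give about 0.89, and at 10% positivity pools of four need about 0.59 cartridges per sample. The savings observed in the study differ from this idealized model because fewer pools tested positive (62.6%) than independence would predict.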
Lastly, we tested all individual samples and compared their results to pool testing, which was needed to describe the performance of the test. However, this approach precluded us from evaluating the staff acceptability of the method and the real-life time savings when applied under operational conditions.
TB continues to be a major cause of death and long-term morbidity worldwide, and a high proportion of individuals with TB are missed by health services [1]. Although it is recommended that individuals with presumptive TB should be tested using NAA assays, implementing this recommendation at scale has been difficult because most individuals attend primary health care centers with limited laboratory capacity. The most commonly used NAA platform is the four-module GeneXpert, which requires uninterrupted electricity and air conditioning; this confines it to higher level laboratories and generates the need for sputum transport. Alternative battery-operated platforms, such as GeneXpert EDGE, are promoted as point of care devices. Although this one-module platform processes one test at a time and its throughput is insufficient for busy primary health care clinics, the use of pooling would allow the testing of more patients, whereas local testing would reduce sputum transportation costs. Moreover, the number of Xpert cartridges procured worldwide is insufficient for the number of individuals that should be tested. Most TB diagnostic centers require testing between five and 10 individuals with presumptive TB to confirm one person with TB; therefore, identifying the 10 million individuals with TB reported in 2020 would have required testing between 50 and 100 million individuals [1]. Given that only 15.4 million cartridges were procured in 2020 [5], the global supply of cartridges is a fraction of the number needed.
Clearly, there is a need to identify methods that allow testing more patients with a limited number of cartridges and pooling could play a role by increasing the efficiency of testing.
Our study adds to the increasing body of evidence that pooling of samples for Xpert Ultra testing improves the efficiency of testing, potentially enabling the screening and testing of larger numbers of individuals more cost-effectively. We have shown that, as in previous studies, individual and pooled testing of samples have a high level of agreement. The pooling method has the potential to increase the number of individuals tested for TB at the local level by increasing both the number of individuals that can be tested with a limited number of cartridges and the throughput of battery-operated platforms. Moreover, these efficiencies could be achieved while reducing the cost of testing through a reduction in sample transportation and in the number of cartridges required per patient. Further implementation studies are warranted to test these approaches on a large scale.

Declaration of competing interest
The authors have no conflicts of interest to declare.

Funding
This research was funded, in part, by a TB REACH grant supported by Global Affairs Canada (STBP/TBREACH/GSA/2020-04) and the UK Medical Research Council Public Health Intervention Development (MR/W004313/1) to LEC.