COVID-19: How to make between-country comparisons

Highlights
• Direct comparison of raw data of the COVID-19 epidemic between countries is not possible.
• This study presented methods that allow for a direct comparison of the development of the epidemic between countries.
• These methods were applied to countries with different containment strategies.
• The results confirmed that early containment is key in flattening the curve of the epidemic.
• Most European countries seemed to be containing the epidemic, whereas the USA seemed poised for an explosive accumulation of COVID-19-related deaths.


Introduction
Since the start of the outbreak of SARS-CoV-2 in December 2019, in the Hubei province in China, the virus has quickly spread across the world (Ahn et al., 2020; Bar-On et al., 2020; Dyer, 2020). As the virus spread, so did the COVID-19 disease that it causes. To curb the surge in COVID-19-related mortality, different governments enforced different measures for the containment of the pandemic (Yan et al., 2020; Pike and Saini, 2020). It is difficult to compare numbers of cases between countries because of the vast differences in testing policies. Now, as the pandemic claims more lives worldwide, the accumulation of mortality can be compared between countries to obtain some insight into the effectiveness of the different containment measures (EU, 2020; Petropoulos and Makridakis, 2020). However, a direct comparison of crude rates between countries will be biased, even for mortality. The current study proposes methods to enable a comparison and presents the results of this comparison.

Data
Reported numbers of cases and deaths per country were obtained from the European Union Open Data Portal, where data on worldwide numbers of reported cases and numbers of reported deaths for the COVID-19 pandemic are updated daily (EU, 2020). Numbers of reported cases and deaths between 01 January and 17 April 2020 were compared between countries.

Comparability of data between countries
The comparability of data between countries was increased in two distinct ways. First, the start of the epidemic was synchronised between countries by using the date of the first reported COVID-19 case or COVID-19-related death as the index date. Second, the size and susceptibility of the population and the probability of a COVID-19 case or a COVID-19-related death being reported as such were all corrected for in a single procedure. All cumulative numbers of cases or deaths were normalised to a reference number. As a reference, the cumulative numbers of cases or deaths on day 25 of the synchronised epidemic were taken (i.e. day 25 after the index date for each country).
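The two steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used in the study: the function names `synchronise` and `normalise` are hypothetical, the index date is counted as day 1, and daily counts are assumed to be non-negative integers in date order.

```python
from typing import List, Optional


def synchronise(daily_counts: List[int]) -> List[int]:
    """Return cumulative counts starting at the index date (day 1).

    The index date is taken as the first day with a non-zero reported count.
    An empty list is returned when no case or death was ever reported.
    """
    first = next((i for i, n in enumerate(daily_counts) if n > 0), None)
    if first is None:
        return []
    cumulative = []
    total = 0
    for n in daily_counts[first:]:
        total += n
        cumulative.append(total)
    return cumulative


def normalise(cumulative: List[int], reference_day: int = 25) -> Optional[List[float]]:
    """Express cumulative counts as percentages of the count on reference_day.

    Days are 1-based, so reference_day=25 reads cumulative[24].
    Returns None when the series does not yet reach the reference day.
    """
    if len(cumulative) < reference_day:
        return None
    reference = cumulative[reference_day - 1]
    if reference <= 0:
        return None
    return [100.0 * c / reference for c in cumulative]


# Toy series (not real data), normalised to day 3 to keep the example short.
toy = synchronise([0, 0, 1, 0, 2, 3])
toy_percentages = normalise(toy, reference_day=3)
```

With this convention every country's curve passes through 100% on the reference day, so curves for different countries can be overlaid on one axis.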

Sensitivity analyses
Day 25 was chosen as the reference day because, in most countries, by day 25 after the first case or death the epidemic had established itself and the number of cases or deaths had increased to a level where random fluctuations were reduced to an acceptable level. To assess the potential influence of choosing day 25 as a reference, sensitivity analyses were performed repeating all analyses, while taking days 20 and 30 as the references.
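The sensitivity analysis amounts to repeating the normalisation with a different reference day. A minimal sketch, assuming a synchronised cumulative series in which the index date is day 1 (the function name `normalised_curves` is hypothetical and not taken from the study):

```python
from typing import Dict, List, Optional, Sequence


def normalised_curves(
    cumulative: List[int],
    reference_days: Sequence[int] = (20, 25, 30),
) -> Dict[int, Optional[List[float]]]:
    """Normalise one synchronised cumulative series to several reference days.

    Returns a mapping from reference day to the curve in percent, or None
    when the series is too short (or the reference count is not positive).
    """
    curves: Dict[int, Optional[List[float]]] = {}
    for day in reference_days:
        if len(cumulative) < day or cumulative[day - 1] <= 0:
            curves[day] = None
            continue
        reference = cumulative[day - 1]
        curves[day] = [100.0 * c / reference for c in cumulative]
    return curves


# Toy series of 30 days with cumulative counts 1, 2, ..., 30 (not real data).
toy_curves = normalised_curves(list(range(1, 31)))
```

Changing the reference day rescales each country's curve by a constant factor (the ratio of the counts on the two reference days), so the shape of an individual curve is unchanged and only the vertical spread between countries shifts.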

Visual representation and categorisation of countries
After synchronising countries by the date of the first death in each country, cumulative numbers of deaths were expressed as percentages of the cumulative number of deaths on day 25 for each country. The resulting percentages were plotted against synchronised time. Temporal trends in cumulative numbers of deaths were compared with those for China, where the pandemic started, and where the temporal trends have therefore developed the furthest. For comparison with China, countries were divided into three categories. The first category comprised countries with a policy similar to that of China. These are the European countries, where governments waited for the epidemic to establish itself, but not for substantial numbers of COVID-19-related deaths to occur, before taking preventive measures. Germany, Italy, the Netherlands, Spain, and Sweden (in alphabetical order) were used as examples, but graph shapes for other European countries were rather similar. The second category was South Korea, where strict preventive measures were put into place even before the virus had spread substantially in the population. The third category was the United States of America (USA), where preventive measures were not put into place until large numbers of deaths had already occurred.

Results
As shown in Figure 1, the temporal development of the epidemic appears very different between different countries in panels A to C, but not in panel D. When comparing the number of cases per 100,000 inhabitants (panel A), both the absolute values and the timing were different between countries. Comparing the number of deaths per 100,000 inhabitants (panel B) normalises the timing somewhat, but not the absolute values. Comparing the number of cases as a percentage of the number on day 25 after the first case (panel C) normalises the absolute numbers somewhat, but not the timing. Only comparing the number of deaths as a percentage of the number on day 25 after the first death (panel D) allows a direct comparison between countries. This comparison was therefore used for all further comparisons between countries. Figure 1 panel D compares a number of European countries with China. As can be seen from panel D, the epidemic followed the natural development, which was almost identical to the April, corresponding to about day 30 in most European countries). Panel D further shows a minor flattening of the curve for most European countries between 05 April and 17 April. The two countries with the most extremely developed epidemics in Europe were Italy and Spain. Spain had the most extreme flattening of the curve, while this flattening was almost completely absent from the Italian curve. Figure 2 shows the temporal development of the epidemic in South Korea, which was much more gradual. Finally, Figure 3 shows the development in the USA, where the epidemic developed much more rapidly.

Sensitivity analyses and supplemental material
Sensitivity analyses, using different reference days, produced very similar results. The supplemental material includes alternative versions of all graphs shown in all Figures. All graphs in the supplemental material are shown both until 05 April and until 17 April and with day 20 and day 30 as reference dates. Absolute differences between countries are more pronounced when using an earlier reference day (i.e. 20 instead of 25) and less pronounced when using a later reference day (i.e. 30 instead of 25), but overall conclusions are unaffected.

Discussion
These results clearly show that comparing numbers of cases or deaths per 100,000 inhabitants falsely suggests huge differences between countries. Using the number of cases expressed as a percentage of the number of cases on the 25th day after the first case provides a more consistent estimate of the affected proportion of the population at risk. However, due to large chance variation in the detection of the first case, synchronisation of the epidemic between countries is still poor. Using the number of deaths expressed as a percentage of the number of deaths on the 25th day after the first death provides the best direct comparison between countries.
Using this observation to further compare different countries, a clear difference was observed in the development of the COVID-19 epidemic between countries with different containment policies. In most European countries, the early stages of the epidemic seemed to have a temporal development very similar to that in China. The curves flattened about 3 weeks after the implementation of strict containment strategies, except for the Italian curve, which continued to follow, and possibly even exceed, the Chinese one. A possible explanation could be that containment measures were taken too late in Italy. Italy was the first European country to be affected and the pandemic was therefore recognised relatively late. This could also be in line with the results from South Korea, where very early containment measures prevented the initial exponential development of the epidemic, which was seen in China and all European countries. If this explanation is correct, this could be worrisome for the USA, where containment measures have lagged behind since the start of the epidemic. Indeed, an explosive development of the epidemic in the USA was observed and is already far beyond the development observed in China.
To appreciate these results it is important to note that data from different countries are not directly comparable for at least five distinct reasons. First, the virus did not simultaneously arrive in all countries, causing a desynchronised development of the epidemic in different countries. Second, absolute numbers are incomparable due to different population sizes. Third, rates per 100,000 of the population are incomparable because not all countries are homogeneously affected. Especially in the larger countries, like China and the USA, epidemics can be (temporarily) focused on a localised level. For example, in China, the province of Hubei was severely affected, while the rest of the country was not. Therefore, correction for the total size of the Chinese population would not provide a representative figure. Of particular note, in panels A and B of Figure 1 the numbers of cases and deaths in China disappeared almost completely; this was due to false inflation of the denominator. Fourth, susceptibility to death by COVID-19 can differ between populations, depending on the demographic composition of a country's population; for example, in Italy, older people are known to be relatively overrepresented in the population and more likely to live in a single household with relatives from a younger generation, causing increased numbers of elderly to be infected and therefore relatively more COVID-19 mortality. Fifth, a death during the COVID-19 pandemic is only reported as a COVID-19-related death if the patient was diagnosed with SARS-CoV-2 infection; therefore, differences in testing policy and guidelines for clinical diagnosis (i.e. in the absence of laboratory testing) will also cause differences in estimated numbers of COVID-19-related deaths.
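The inflation of the denominator mentioned under the third reason can be written out explicitly. As an illustrative sketch (the symbols are introduced here and do not appear in the study), suppose all D deaths occur in a region with population N_r inside a country with total population N:

```latex
\[
\frac{D}{N} \times 100{,}000
  \;=\; \frac{N_r}{N} \times \left( \frac{D}{N_r} \times 100{,}000 \right),
  \qquad 0 < \frac{N_r}{N} \le 1 .
\]
```

The national rate is the regional rate multiplied by the fraction N_r/N, so the smaller the affected region is relative to the whole country, the more the national rate understates the severity of the local epidemic.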
The first problem was addressed by choosing an appropriate index date for each country and setting this date to day 1, for the start of the epidemic in that country. As an index date, this study chose the date of the first reported COVID-19-related case or death in each country, depending on whether cases or deaths were being synchronised. Admittedly, chance processes play a role here, causing some uncertainty in determining the index date. This was especially clear for the date of the index case (Figure 1 panels A and C). Synchronising the development of deaths by the date of the first death was much better (Figure 1 panels B and D). The remaining four problems all pertained to the size and the susceptibility of the population, or the probability of a COVID-19 case or COVID-19-related death being reported as such. Adequate control for all factors influencing these problems is a practical impossibility. Therefore, this study chose to normalise the cumulative number of deaths by a reference number of deaths. The number of actually reported COVID-19-related deaths is clearly a direct function of the size and susceptibility of the population and the probability of a COVID-19-related death being reported as such. Therefore, taking the reported number of COVID-19-related deaths on a synchronised reference date as a standard simultaneously corrects results for all these factors.
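The reasoning behind this simultaneous correction can be made explicit under one simplifying assumption that the study does not state formally: that the country-specific factors act multiplicatively and stay constant over the period considered. With symbols introduced here for illustration only (N_c for the size of the affected population, s_c for its susceptibility, p_c for the reporting probability, and g_c(t) for the underlying time course in country c):

```latex
\[
D_c(t) = N_c \, s_c \, p_c \, g_c(t)
\quad\Longrightarrow\quad
R_c(t) = 100 \times \frac{D_c(t)}{D_c(25)}
       = 100 \times \frac{N_c \, s_c \, p_c \, g_c(t)}{N_c \, s_c \, p_c \, g_c(25)}
       = 100 \times \frac{g_c(t)}{g_c(25)} .
\]
```

The product N_c s_c p_c cancels, so the normalised curve R_c(t) depends only on the time course g_c(t). If any of these factors changes during the epidemic, for example through a change in testing policy, the cancellation is only approximate.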
In conclusion, although the future development of the pandemic remains difficult to predict accurately due to changing containment policies, changing seasonal influences (Neher et al., 2020), and the possibility of a depletion of susceptibilities or the development of herd immunity (Kwok et al., 2020; Tang et al., 2020), current data suggest that the USA should expect an explosive increase in cumulative mortality due to COVID-19, with containment policies still lagging behind, while most European countries seem well on the way to containing the pandemic.