As many of my readers know, I have accepted the position of research group leader at the Complexity Science Hub (CSH) in Vienna (and I continue as University of Connecticut professor, so right now I am in Connecticut).
A week ago the Austrian government asked CSH to conduct research that would help formulate better policies for dealing with the Covid-19 epidemic. As an aside, I find it incredibly refreshing that a national government would actually ask scientists for help (and have a research institute ready to provide such assistance). In any case, CSH decided to put other research on hold and redirect all of its scientific power to deal with the challenges that the Corona crisis poses to our society.
As a result, over the past week I have been contributing to a working group that asked the following question: how effective are various public health measures in slowing down, or even stopping, the spread of the Covid-19 epidemic? There is quite a lot of variation in how different countries have decided to deal with Corona, ranging from the highly Draconian measures implemented by China to the (at least initially) laissez-faire approach of the United Kingdom. Of particular interest could be pairwise comparisons between such similar countries as Denmark and Sweden, which have adopted very different Covid policies.
The goal of my research last week, thus, was to estimate the effects of various measures implemented by national governments to slow down and reverse the spread of the Covid-19 epidemic. A direct approach to answering this question is to track the growth rate of the epidemic (with the severity of the epidemic estimated by the number of “Active Cases”, that is, the number of people known to be currently infected) and then observe how various interventions affect this growth rate. For example, one could use the “basic reproductive number” (R0): the average number of people directly infected by one sick individual. The goal of an intervention is to bring the reproductive number below 1, which will result in the epidemic dying out.
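As a rough illustration of this direct approach, the sketch below estimates an average daily exponential growth rate from a series of Active Cases and converts it to a reproductive number. It assumes the simplest SIR-type relationship, in which the growth rate r equals the transmission rate minus the removal rate gamma, so that R = 1 + r / gamma. The case counts and the value of gamma are made up for illustration and are not taken from any real data set.

```python
import math

def growth_rate(active_cases):
    """Average daily exponential growth rate between the first and last count."""
    days = len(active_cases) - 1
    return math.log(active_cases[-1] / active_cases[0]) / days

def reproductive_number(r, gamma):
    """SIR-type assumption: r = beta - gamma, hence R = beta / gamma = 1 + r / gamma."""
    return 1.0 + r / gamma

# Hypothetical daily counts of Active Cases (not real data)
cases = [100, 141, 200, 283, 400]

r = growth_rate(cases)
# gamma = 0.1 per day is an assumed removal rate (about 10 days of infectiousness)
R = reproductive_number(r, gamma=0.1)

print(f"growth rate per day: {r:.3f}")
print(f"reproductive number: {R:.2f}")
```

An intervention that works should push the estimated growth rate below zero, which under this assumption is the same as pushing the reproductive number below 1.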
However, one potential problem with such direct approaches is that a large proportion (at least 50%) of people infected with Corona are “asymptomatic”, meaning that they themselves don’t know they are carrying the disease. As a result, they are not included in the disease statistics. Even worse, the proportion of known infected individuals tends to increase with time, as people become more aware of the epidemic and governments may decide to massively test asymptomatic individuals. Changes in the disease detection rate will, then, tend to mask changes in the epidemic growth rate.
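A small sketch with made-up numbers shows how this masking can happen. The true number of infected grows at a constant assumed rate, while the fraction of infections that gets detected rises over the same period; the growth rate computed from reported cases then overstates the true one. All values here are hypothetical.

```python
import math

true_r = 0.1          # assumed true exponential growth rate per day
n_days = 10
days = range(n_days + 1)

# True (mostly unobserved) number of infected people
true_infected = [1000 * math.exp(true_r * t) for t in days]

# Assumed detection fraction, rising linearly from 10% to 50% as testing expands
detected_fraction = [0.1 + 0.04 * t for t in days]

# What the official statistics would show
reported = [f * n for f, n in zip(detected_fraction, true_infected)]

# Growth rate inferred from reported cases alone
apparent_r = math.log(reported[-1] / reported[0]) / n_days

print(f"true growth rate:     {true_r:.3f} per day")
print(f"apparent growth rate: {apparent_r:.3f} per day")
```

The apparent rate equals the true rate plus the growth rate of the detection fraction itself, so an expanding testing program can make an epidemic look as if it is accelerating even when it is not.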
So what can be done? My idea is that we should model both the dynamic process of how the disease grows (and eventually declines) and how the numbers of actual infections are translated into official statistics. If we have a good process model (and for epidemics, we do), then an analysis of data based on such a mechanistic model will work better than a purely data-driven approach. The reason is that we can build into the model what is known, which enables us to use the precious data efficiently to estimate what is unknown.
For the technically minded, I have posted a document that describes exactly what I have done (GitHub:pturchin/Covid19). But in this blog post, I will simply illustrate the approach with one specific example using non-technical language, which (hopefully) should be understandable to all readers (ask me questions in the comments if anything is unclear).
Important disclaimer: these results are quite preliminary and should be taken with a grain of salt. I often use my blog to air new ideas in order to find any problems with them at an earlier, rather than later, stage. And I certainly don’t speak for any organization (including CSH) or any government.
The illustrating example I use is the Covid-19 epidemic in South Korea. First, let’s look at the data. The chart below shows the progression of the disease, as measured by the number of “Active Cases” (people known to be infected).
Next, let’s see how fitting a model to these data can clarify the internal mechanics of the epidemic. I use a variant of a standard epidemiological model, known as SIRD (so named for the first letters of the variables it tracks: the numbers of Susceptible, Infected, Recovered, and Dead). We want to make sure that the model does a good job of approximating the epidemic from a variety of different angles. The next series of charts shows whether the model succeeds in this. Points are the actual data, while curves depict model predictions.
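For readers who want to see what a basic SIRD model looks like in code, here is a minimal sketch of the textbook version with constant rates, stepped forward with a simple Euler scheme. It is not the variant fitted in this analysis (that one is described in the technical document), and the function name, parameter values, and population size are all illustrative assumptions.

```python
def simulate_sird(beta, gamma, delta, s0, i0, days, steps_per_day=10):
    """Textbook SIRD model with constant rates, integrated by Euler steps.

    beta: transmission rate, gamma: recovery rate, delta: death rate (all per day).
    Returns one (S, I, R, D) tuple per day, including day 0.
    """
    n = s0 + i0                      # total population, held fixed as a simplification
    s, i, r, d = float(s0), float(i0), 0.0, 0.0
    dt = 1.0 / steps_per_day
    history = [(s, i, r, d)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i / n * dt
            new_recoveries = gamma * i * dt
            new_deaths = delta * i * dt
            s -= new_infections
            i += new_infections - new_recoveries - new_deaths
            r += new_recoveries
            d += new_deaths
        history.append((s, i, r, d))
    return history

# Illustrative values only, not estimates from any data set
history = simulate_sird(beta=0.4, gamma=0.1, delta=0.005,
                        s0=999_990, i0=10, days=60)

for day in (0, 20, 40, 60):
    s, i, r, d = history[day]
    print(f"day {day:2d}: infected {i:12.0f}, recovered {r:12.0f}, dead {d:10.0f}")
```

Every person who leaves one compartment enters another, so the four numbers always add up to the starting population; this is a quick sanity check on any implementation.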
In fact, the model does a very good job. This increases our confidence that it has captured the essential mechanisms driving the epidemic. And we only need to add two additional features to the basic SIRD model to do this.
The key parameter in the model is the transmission rate, which determines how fast the disease spreads from the infected population to the susceptible one. The second important parameter is the detection rate. Both of these parameters changed during the epidemic. As is well known, once the South Korean officials realized that they had an epidemic to deal with, they massively expanded their testing program and imposed vigorous quarantine measures. These measures should have increased the detection rate and decreased the transmission rate. Building these changes into the model, we can estimate when and by how much these two rates changed. Here’s what I got:
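One simple way to let a rate change during an epidemic is to make it switch smoothly from an initial level to a final level around some day. The sketch below does this with a logistic curve. This functional form, the helper names, and every number in it are assumptions chosen for illustration; the form actually used in the analysis is given in the technical document and may differ.

```python
import math

def smooth_step(t, start_value, end_value, midpoint, steepness=0.3):
    """Logistic transition from start_value to end_value, centered on `midpoint`."""
    return start_value + (end_value - start_value) / (
        1.0 + math.exp(-steepness * (t - midpoint))
    )

def transmission_rate(day):
    # Hypothetical: falls from 0.4 to 0.05 per day, centered on day 37
    return smooth_step(day, 0.4, 0.05, midpoint=37)

def detection_rate(day):
    # Hypothetical: rises from 0 to 0.7, centered on day 40
    return smooth_step(day, 0.0, 0.7, midpoint=40)

for day in (0, 25, 50, 75):
    print(f"day {day:2d}: transmission {transmission_rate(day):.3f}, "
          f"detection {detection_rate(day):.3f}")
```

With such curves plugged into an epidemic model, fitting the model to data amounts to finding the levels, the timing, and the speed of each transition that best reproduce the observed numbers.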
Panel (a) shows how the transmission rate (beta) changed with time. Initially the transmission rate was very high, with an exponential rate of increase of around 0.4 per day (in other words, the number of infected was growing by roughly 40 to 50% every day). This parameter began declining after day 25 (mid-February), but reached low levels only close to day 50 (early March).
Panel (b) shows the detection rate. It is estimated as 0 until day 30, which suggests that initially, and for quite a while, the epidemic was growing “below the radar”. People were getting sick in growing numbers, but society as a whole was not yet aware of it.
South Korean authorities started testing for Covid-19 in early February, and the scale of testing was massively expanded after Feb. 20, which closely corresponds to day 30, when the model-predicted detection rate began increasing. Eventually it reached a very high level of nearly 70%, suggesting that aggressive testing of asymptomatic people is bearing fruit.
Overall, then, this analysis of South Korean data makes a lot of sense in light of what we know about the course of the epidemic there. There are some caveats, which I discuss in the technical document, but the model fits exceedingly well and provides us with numerical estimates of the effectiveness of the measures taken by the SK government. The intervention was highly effective.
A future post will report on the analysis I’ve done for China. The situation there was more complex, and the model fit was not as excellent as for the SK epidemic. But it still yields very interesting and instructive insights. Stay tuned.
Added (21:00 23.III.2020): I have posted the document providing technical details and the R-script in my GitHub repository.