The COVID-19 outbreak provides a reminder of possible uses of optimisation tools for model calibration. Recall that where the value of one or more model inputs is not known, it can be estimated by using an optimisation method to minimise the error between the model’s predictions and actual observations.
A simple example of using optimisation to find model parameters is linear regression, where the slope and intercept of the best-fitting line can be found by minimising the sum of the squared differences between the observations and the predictions of any hypothesised line. One could attempt to apply similar concepts to the unknown aspects of a model of COVID-19 infections.
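To make the regression example concrete, here is a minimal sketch in Python. The data points are made up for illustration (they are not from this article), and plain gradient descent stands in for whatever optimiser one prefers; the closed-form least-squares solution is included only as a cross-check.

```python
# Illustrative data only (an assumption, not from the article):
# points scattered around the line y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1]


def sse(slope, intercept):
    """Sum of squared differences between observations and the line's predictions."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


def fit_by_gradient_descent(steps=20000, learning_rate=0.01):
    """Minimise sse() by repeatedly stepping against its gradient."""
    slope, intercept = 0.0, 0.0
    for _ in range(steps):
        residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
        grad_slope = sum(-2 * x * r for x, r in zip(xs, residuals))
        grad_intercept = sum(-2 * r for r in residuals)
        slope -= learning_rate * grad_slope
        intercept -= learning_rate * grad_intercept
    return slope, intercept


def fit_closed_form():
    """Textbook least-squares formulas, used here as a cross-check."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


slope, intercept = fit_by_gradient_descent()
print(f"optimiser:   slope={slope:.3f}, intercept={intercept:.3f}")
print("closed form: slope={:.3f}, intercept={:.3f}".format(*fit_closed_form()))
```

Both routes should agree, which is the point: the regression line is simply the pair of parameters that makes the squared error as small as possible.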
With respect to cases in France, the graph above shows that the case-number growth rate is currently around 10% per day. Weekly new case numbers are now about ten times as high as they were just before the lock-down on 17th March (56 941 new cases in the week to 7th April, compared to 5 946 in the week to 17th March).
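As a rough check on these figures, the two weekly totals can be turned into an average daily growth rate. The sketch below assumes constant compound growth over the 21 days separating the two week-endings, which is a simplification of my own rather than something the data guarantee.

```python
# Weekly new-case totals quoted in the text.
cases_week_to_17_march = 5946
cases_week_to_7_april = 56941

# Assumption: the two week-endings are 21 days apart and growth is constant.
days_between = 21

ratio = cases_week_to_7_april / cases_week_to_17_march
implied_daily_growth = ratio ** (1 / days_between) - 1

print(f"ratio of weekly totals: {ratio:.1f}x")
print(f"implied average daily growth: {implied_daily_growth:.1%}")
```

Under that assumption the implied average comes out slightly above 10% per day, broadly in line with the growth rate read off the graph.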
In simple terms, given an incubation period of up to two weeks, one would, at first consideration at least, expect the number of new cases to fall significantly, if not to close to zero. There are reasons why this is not so: there would still be transmission within households, and in reality the lock-down is not total (“essential” shops are still open, which in France includes the local boulangeries, seriously!). Also, improved awareness and testing may mean that a higher proportion of infections is detected.
However, the main driver of a significant increase could also be that there were many more non-detected infections prior to the lock-down, which have resulted in new cases since.
This led me to build a small and simple model. It assumes that there are two unknown parameters: the growth rates in daily new cases before and after the lock-down. Other required inputs (such as the incubation period and the percentage of infections detected) are assumed to be known and constant.
I then used Solver in Excel to find the two growth rates that minimise the squared difference, on a weekly basis, between the number of new cases predicted and the number observed. Once this is done, the absolute numbers of cases can be calculated:
|Week Ending||Observed Cases||Modelled|
As an example, the model predicts that there had been a total of 130 000 infections when the lock-down started, compared to 7 730 people known to have been infected at that point.
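The calibration step can be sketched in Python as follows. This is not the spreadsheet model itself: the incubation delay, detection rate, starting level and lock-down day are all placeholder assumptions, a brute-force grid search stands in for Excel's Solver, and the "observed" weekly figures are synthetic (generated from known growth rates) so that one can check the search recovers them.

```python
# All constants are illustrative assumptions, not the inputs used in the article.
INCUBATION_DAYS = 7         # delay between infection and appearing as a case
DETECTION_RATE = 0.2        # share of infections that become observed cases
INITIAL_INFECTIONS = 100.0  # new infections on day 0
LOCKDOWN_DAY = 14           # last day on which the pre-lock-down rate applies
N_DAYS = 35                 # five weeks


def weekly_cases(g_before, g_after):
    """Weekly detected cases for given daily growth rates before/after lock-down."""
    infections = [INITIAL_INFECTIONS]
    for day in range(1, N_DAYS):
        growth = g_before if day <= LOCKDOWN_DAY else g_after
        infections.append(infections[-1] * (1 + growth))
    # Infections from day d show up as detected cases on day d + INCUBATION_DAYS.
    detected = [0.0] * N_DAYS
    for day in range(INCUBATION_DAYS, N_DAYS):
        detected[day] = DETECTION_RATE * infections[day - INCUBATION_DAYS]
    return [sum(detected[start:start + 7]) for start in range(0, N_DAYS, 7)]


def squared_error(params, observed):
    """Sum over weeks of (modelled - observed) squared."""
    modelled = weekly_cases(*params)
    return sum((m - o) ** 2 for m, o in zip(modelled, observed))


def grid_search(observed, step=0.005, max_rate=0.4):
    """Try every pair of growth rates on a grid and keep the best-fitting pair."""
    n = int(round(max_rate / step))
    candidates = [i * step for i in range(n + 1)]
    pairs = ((gb, ga) for gb in candidates for ga in candidates)
    return min(pairs, key=lambda p: squared_error(p, observed))


# Synthetic "observations" from known rates (25% before, 5% after).
observed = weekly_cases(0.25, 0.05)
g_before, g_after = grid_search(observed)
print(f"fitted growth before: {g_before:.3f}, after: {g_after:.3f}")
```

Note that even in this toy version, detected cases keep rising at the pre-lock-down rate for a full incubation period after the lock-down starts, because they reflect infections that had already happened.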
These modelled figures may be pessimistic: if awareness and testing have improved significantly since the lock-down (which I have no information about), then the number of cases (and the growth rate) needed to fit the model to the observed cases would be lower, since more of the modelled cases would be detected. But a simple reading also suggests that about 1.5 million people have been infected on a cumulative basis. In fact, this tallies approximately with an alternative way of estimating the figures: as of 7th April there have been 10 328 deaths, so with a mortality rate in the region of 1%, case numbers would be around 1 000 000.
The COVID-19 outbreak provides many opportunities to remind ourselves of the richness of the analytic toolkit available to model it, which does not need to be restricted to “risk analysis”!