Wednesday, 11 March 2020

Coronavirus Data Analysis: Very Worrying Findings

I have been digging into the coronavirus data. First of all, the easiest one to access: Worldometer,
https://www.worldometers.info/coronavirus/
A useful data source, although it only gives historic data for the total case count and the death count. In their raw form, these accumulating totals are not very informative, so I turned them into a "counts per day" form and then plotted them on a log scale. Here is the cases per day graph.
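For anyone who wants to repeat the conversion, here is a minimal sketch in Python. It is not the code used for the graph. The function names are made up for illustration, and the running total is a hypothetical series that doubles every day, not real Worldometer data.

```python
import math

def daily_counts(cumulative):
    """Turn a running total into per-day counts by differencing neighbours."""
    return [later - earlier for earlier, later in zip(cumulative, cumulative[1:])]

def log10_counts(counts):
    """Log-scale values for plotting; zero or negative days become None."""
    return [math.log10(c) if c > 0 else None for c in counts]

# Hypothetical running total that doubles every day (NOT real data).
totals = [100, 200, 400, 800, 1600]

per_day = daily_counts(totals)
logs = log10_counts(per_day)

# On a log scale, a series that keeps doubling rises by the same step each day.
steps = [b - a for a, b in zip(logs, logs[1:])]

print(per_day)                       # [100, 200, 400, 800]
print([round(s, 3) for s in steps])  # [0.301, 0.301, 0.301]
```

The constant step of about 0.301 (that is, log10 of 2) is what makes a doubling series look like a straight line once it is plotted on a log axis.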



The useful thing about using a log scale is that exponential increases (a sign of uncontrolled growth) show up as straight lines. On the left side of the graph you can see the exponential growth that took place in China, early in the epidemic. Then, there is a dip, as China brought their outbreak under control. On the right, we see the case rate increase again as the virus takes off in Europe. I was initially reassured that this growth seems to be slowing down. But then I took a look at the death rate graph.

The notable feature is that the death rate seems to be accelerating, if anything. It is now at its highest rate ever, surpassing the peak of the Chinese wave. Was this discrepancy just a glitch?

To understand more, I downloaded the Johns Hopkins University dataset from GitHub. This seems to be the best source of simple, QC'd data on Covid-19. I scratched out some Python code to do similar plots, broken down by country, and focusing on a few places of interest: Hubei, South Korea, Iran and some key European countries. Below are the case rate and death rate plots:

Look at the death rate graph... Most noticeable is the exponential growth occurring in Italy, and also in Iran, although at a slightly lower rate. This shows that the epidemic is completely out of control in these countries. Now look back at the new case count graph. The growth for these countries is tailing off. So I did a cross check of deaths reported against total cases. Below is the result.

The ratio of deaths to true cases (i.e. the fatality rate) SHOULD be relatively constant. So, a high apparent death rate indicates a very poor rate of detecting cases. What this shows is that Italy has an exponential increase in cases and deaths, but a terrible, and worsening, case detection rate via testing, almost ten times worse than the best performers. Iran are at least slightly more under control. The UK is doing better, and Korea, who seem to have things under control now, have done best of all.
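The reasoning can be written down as a small calculation. This is only a sketch: the function names, the 1% "true" fatality rate and the country figures below are all hypothetical, chosen to make the arithmetic easy to follow. It also ignores the delay between a case being reported and a death being recorded.

```python
def apparent_fatality_rate(deaths, reported_cases):
    """Deaths divided by the cases that testing actually found."""
    return deaths / reported_cases

def implied_detection_rate(deaths, reported_cases, true_fatality_rate):
    """Fraction of true cases that were reported, IF the assumed
    true fatality rate is the same everywhere."""
    estimated_true_cases = deaths / true_fatality_rate
    return reported_cases / estimated_true_cases

# Hypothetical figures, NOT taken from any dataset.
ASSUMED_TRUE_RATE = 0.01  # assumption: 1% of true cases die
examples = {
    "Country A": (50, 1000),  # (deaths, reported cases)
    "Country B": (10, 1000),
}

for name, (deaths, cases) in examples.items():
    apparent = apparent_fatality_rate(deaths, cases)
    detected = implied_detection_rate(deaths, cases, ASSUMED_TRUE_RATE)
    print(f"{name}: apparent rate {apparent:.1%}, implied detection {detected:.0%}")
```

With these made-up numbers, Country A shows an apparent rate of 5%, which under the 1% assumption means only about one true case in five is being found. Country B's apparent rate matches the assumed true rate, so its testing would be catching essentially every case.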

The lesson: Beware Italy and Iran. They have immense problems and no solution in sight.


Thursday, 5 March 2020

Coronavirus - why politicians and journalists should learn multiplication

Context switch. Coronavirus. Covid-19. Something tells me I am going to be blogging about it a lot in the next few months. And that something is data science and models. Disease spread models tell us, based on some assumptions, what is likely to happen next. In my other life, I have been playing with disease spread models a lot recently, under a research project, and have come to understand their ways.

Currently I am worried about Covid-19, and I believe you should be too. This is a case where being scared can save you. It can save us all. I'll explain all in future posts, but here are the basic facts.

When the virus first emerged in China, it was not noticed for a while. When doctors did begin to notice an unusual increase in a particular type of pneumonia, the information was suppressed, for political reasons. Nobody did anything to slow the spread, so it spread like wildfire. Finally the Chinese government came to their senses and resolved to shut down the exponential growth in cases. They were remarkably successful, because they took it seriously, to the maximum degree. Had they not done so, the ever multiplying number of cases would have continued until most of the population had been infected. Multiplying is the key word here. Cases would have doubled every few days - doubling and doubling.
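To see what "doubling and doubling" means in numbers, here is a minimal sketch. The starting point of 100 cases and the doubling time of 3 days are assumptions picked for illustration, not measured values, and the function name is made up.

```python
def projected_cases(initial_cases, doubling_days, elapsed_days):
    """Cases after elapsed_days of unchecked growth that doubles
    every doubling_days."""
    return initial_cases * 2 ** (elapsed_days / doubling_days)

# Assumptions for illustration only: 100 cases today, doubling every 3 days.
for day in range(0, 31, 6):
    print(day, round(projected_cases(100, 3, day)))
# 0 100
# 6 400
# 12 1600
# 18 6400
# 24 25600
# 30 102400
```

Under these assumptions, a hundred cases become more than a hundred thousand in a month. That is ten doublings, which is a factor of 1024.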

So, the Chinese got a grip. They know how to shut things down. It was remarkable.

But cases dispersed out from the core, travelling on aeroplanes to all corners of the world. And now, the growth begins again. In Iran, the problem was not acknowledged, and again there has been exponential growth. In Italy, it seems to have been spotted rather late, leading to another case of explosive growth. The lessons from China are very clear: if you cut pairwise contact down by an order of magnitude, the virus spread CAN be shut down. But the government here in the UK (for here I be) is playing it another way. They will start to think seriously about shutting it down once it gets big and scary. For now, they will watch and wait. This is the most dangerous nonsense. The course of the virus is already set: there are already hundreds of people incubating the virus in the UK. The growth will be exponential, and it will be harder and harder to stop, the longer this goes on.
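Why cutting contact by an order of magnitude matters can be shown with a toy calculation. This is a deliberately crude sketch, not one of the disease spread models mentioned above. It assumes each case passes the virus on to a fixed fraction of its contacts, and the numbers used (25 contacts per case, a 10% chance of transmission per contact, 100 starting cases) are hypothetical.

```python
def cases_by_generation(initial_cases, contacts_per_case, transmission_prob, generations):
    """New cases in each successive generation of infection, assuming every
    case infects contacts_per_case * transmission_prob others on average."""
    new_per_case = contacts_per_case * transmission_prob
    cases = [float(initial_cases)]
    for _ in range(generations):
        cases.append(cases[-1] * new_per_case)
    return cases

# Hypothetical numbers for illustration only.
normal = cases_by_generation(100, contacts_per_case=25, transmission_prob=0.1, generations=5)
reduced = cases_by_generation(100, contacts_per_case=2.5, transmission_prob=0.1, generations=5)

print([round(c, 1) for c in normal])   # each case infects about 2.5 others
print([round(c, 1) for c in reduced])  # each case infects about 0.25 others
```

With normal contact, every generation is larger than the last. With a tenth of the contact, each case infects fewer than one other person on average, so every generation is smaller than the last and the outbreak dies away.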

Key messages for today:

  1. Take this seriously; it's coming
  2. Take the maximum precautions you can; minimize pairwise contact, wash your hands, eat food cooked at home, etc.
  3. Stock up on food
By avoiding catching this at all costs, you protect yourself and others.

I will talk about the models and what the data is saying in future posts. Sadly, journalists and politicians don't understand models and data science. They don't understand, so YOU need to.