Wednesday, 5 March 2014

Blockchain revelation

I downloaded a monster spreadsheet just released by John Ratcliff (see blog ref below) - created by his fast C++ blockchain reader tool - giving a load of info on the top 100,000 or so bitcoin wallets. I've done some work with Excel and just produced this scatter plot. I think it shows this data source is both rich and apparently flawless.


I'll explain what it shows, as the axes aren't labelled (sorry - sloppy).
Each point is a bitcoin address. The vertical axis tells you the total volume flow for that address (in number of coins). The horizontal axis tells you the current value. So, points at the top have the highest all-time flow and points on the right have the highest current value.
The area at the bottom with no points is just because Excel maxes out at 32k points, so I plotted only the highest flows. The very clear diagonal line is just due to the fact that the current value of an address can't exceed its total flow (the coins had to flow in at some point!). The vertical bars are there because the values are rounded to whole coins. Note, the axes are logarithmic so things at the extremes are much bigger than they seem (the numbers on the axes represent actual numeric values).
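For anyone wanting to rebuild the plot data outside Excel, here is a minimal sketch of the preparation steps described above: keep the highest-flow addresses, check the diagonal constraint (current value never exceeds total flow), and convert both numbers to log10 coordinates. The field names `flow` and `balance`, the default cut-off and the sample rows are assumptions made up for illustration, not the real spreadsheet layout.

```python
import math

def prepare_points(rows, limit=32000):
    """Return (log10 balance, log10 flow) pairs for the highest-flow addresses.

    `rows` is a list of dicts with hypothetical keys 'flow' and 'balance'
    (both in whole coins); these names are assumptions, not the real columns.
    """
    # Keep only the `limit` addresses with the largest all-time flow.
    top = sorted(rows, key=lambda r: r["flow"], reverse=True)[:limit]
    points = []
    for r in top:
        # An address cannot hold more than has ever flowed into it.
        if r["balance"] > r["flow"]:
            raise ValueError(f"balance exceeds flow: {r}")
        # A log axis cannot show zero, so skip addresses with nothing to plot.
        if r["balance"] <= 0 or r["flow"] <= 0:
            continue
        points.append((math.log10(r["balance"]), math.log10(r["flow"])))
    return points

# Made-up sample rows, purely to exercise the function.
sample = [
    {"flow": 1000, "balance": 1000},
    {"flow": 100000, "balance": 10},
    {"flow": 500, "balance": 0},
    {"flow": 10, "balance": 10},
]
points = prepare_points(sample, limit=3)
```

Each pair is (horizontal, vertical), so every point sits on or above the diagonal. Skipping zero balances is a choice forced by the log scale in this sketch; how the original chart handled them is not something the data here tells us.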

Now the interesting initial observations:
One address has a huge flow - 15 times larger than all the rest at some 50 million coins. Its current value is just 177 coins though. It's clearly something very significant.
Many wallets have just one transaction inwards. This includes some of the biggest value wallets. These are clearly hoards/savings.
Plenty of wallets have flows above 10k coins but near-zero value. These must be some kind of coin transfer or business nodes.
There is some extremely strange flow-balancing on some of the very high flow addresses - see the horizontal feature high to the left. Around 25 addresses with flows of 390k coins each - what's that all about!?
Further down (and not visible in the plot) there are innumerable one-transaction addresses with value 50 coins each. This looks like a risk reduction strategy for a massive holding. I haven't done the maths yet but this could be one or more very big hauls.
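The observations above boil down to a few simple rules, so here is a rough sketch of how they might be applied in code, including the sum over the 50-coin addresses that I haven't worked out by hand. Everything specific in it is an assumption for illustration: the keys `flow`, `balance` and `tx_in`, the label names, the thresholds and the sample rows are all hypothetical.

```python
from collections import Counter

def classify(addr, transfer_flow=10000, near_zero=1):
    """Label one address using the rough rules from the observations.

    `addr` is a dict with hypothetical keys 'flow', 'balance' and 'tx_in'
    (number of inward transactions). Thresholds are illustrative guesses.
    """
    if addr["tx_in"] == 1 and addr["balance"] == addr["flow"]:
        # One payment in and nothing spent since.
        return "50-coin single" if addr["balance"] == 50 else "hoard"
    if addr["flow"] > transfer_flow and addr["balance"] < near_zero:
        # Lots of coins have passed through but almost none remain.
        return "pass-through"
    return "other"

# Made-up sample rows, purely to exercise the rules.
sample = [
    {"flow": 50, "balance": 50, "tx_in": 1},
    {"flow": 50, "balance": 50, "tx_in": 1},
    {"flow": 40000, "balance": 40000, "tx_in": 1},
    {"flow": 25000, "balance": 0, "tx_in": 310},
    {"flow": 120, "balance": 30, "tx_in": 4},
]

# How many addresses fall under each label.
counts = Counter(classify(a) for a in sample)

# Total coins sitting in untouched 50-coin addresses.
total_50 = sum(a["balance"] for a in sample if classify(a) == "50-coin single")
```

Run over the full spreadsheet, `total_50` would show how big the 50-coin holdings are in aggregate. It would not show whether they belong to one owner or many.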

There's lots of fodder here (although I may need to start writing some Python to crunch the numbers as Excel is creaking at this kind of data volume). The next thing to get hold of (based on instructions from John Ratcliff) is a full blockchain transaction dump. That will be a step up in data volume (more programming!) - but should be interesting!
