I've written a little python script to analyse the big bitcoin address dataset I mentioned the other day. When I say little, I mean little (I'm new to python so it was slow going, but worth it as I learned a few basics along the way). What the script does is create a "heatmap" showing how bitcoin value is spread across the addresses - i.e. "where" the bitcoins are. To do the processing I first used Excel to save a csv file (before saving I changed the number format to get rid of the commas in the middle of the numbers, which make unpacking the csv harder). Then the python script reads through each line, builds a summary and writes that out as another csv that I can read into Excel to do graphs. It's clunky but it works.
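For anyone curious, the read / summarise / write loop described above could look roughly like this. It is a sketch of the idea rather than the actual script: the column names `balance` and `flow`, the function names and the power-of-ten bucketing are all assumptions made for illustration.

```python
import csv
import math
from collections import defaultdict

ZERO_BIN = float("-inf")  # marks the "no flow at all" edge of the chart


def log_bin(x):
    """Power-of-ten bucket for x; zero (or less) gets its own bucket."""
    return ZERO_BIN if x <= 0 else math.floor(math.log10(x))


def build_heatmap(rows):
    """Sum coin value into (flow bucket, balance bucket) cells."""
    totals = defaultdict(float)
    for row in rows:
        balance = float(row["balance"])  # assumed column name
        flow = float(row["flow"])        # assumed column name
        if balance > 0:                  # empty addresses hold no value
            totals[(log_bin(flow), log_bin(balance))] += balance
    return dict(totals)


def summarise(infile, outfile):
    """Read the address csv and write one summary row per heatmap cell."""
    heat = build_heatmap(csv.DictReader(infile))
    writer = csv.writer(outfile)
    writer.writerow(["flow_bin", "balance_bin", "coins"])
    for (flow_bin, balance_bin), coins in sorted(heat.items()):
        writer.writerow([flow_bin, balance_bin, coins])
    return heat
```

Calling `summarise(open("addresses.csv", newline=""), open("heatmap.csv", "w", newline=""))` (both file names hypothetical) would produce a small csv that Excel can chart directly.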
So here are the results:

The first graph shows a "raw" heatmap. It's a similar idea to the scatter plot I did in a previous post. The leftmost edge of the chart shows bitcoin value in accounts with the lowest "flow" - i.e. the smallest amount of in/out traffic through the accounts. In fact the left-hand edge itself represents bitcoin with no flow at all. The area to the right of that edge shows bitcoin in accounts with progressively more and more flow. As you can see, most of the bitcoin value is on the left-hand edge and there's much less in the middle. Similarly, the area of the chart nearest to you, the viewer, is bitcoin in low value accounts - that farthest away is in the highest value accounts. By the way, like the original scatter plot, the horizontal scales on this plot are logarithmic. The vertical scale is linear, so you can read off the actual number of coins involved. Because of the logarithmic horizontal scales, the area nearest the viewer shows tiny addresses and those on the far edge are monsters.
So, the analysis actually shows that 78% of bitcoin is in no-flow accounts. That is a lot. Only 22% is actively in circulation and being traded etc. This next plot looks "along" the edge - totalling up all the coins for each address activity level. It's on a log scale both ways - otherwise the peak would be overwhelming. This just illustrates the issue. The big peak on the left is bitcoin in accounts with no outflow. Ten million coins or thereabouts - 78% of total value.
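The "looking along the edge" step is just a sum over one axis of the heatmap, and the percentage is that edge total divided by the grand total. Here is a minimal sketch, assuming the heatmap is held as a dict keyed by (flow bucket, balance bucket); that layout, the `NO_FLOW` marker and the toy numbers are made up for illustration and are not the real data.

```python
from collections import defaultdict

NO_FLOW = float("-inf")  # hypothetical marker for the zero-flow bucket


def totals_by_flow(heat):
    """Collapse the heatmap along the balance axis: coins per flow bucket."""
    totals = defaultdict(float)
    for (flow_bin, _balance_bin), coins in heat.items():
        totals[flow_bin] += coins
    return dict(totals)


def no_flow_share(heat):
    """Fraction of all coins sitting in addresses with no flow at all."""
    by_flow = totals_by_flow(heat)
    grand_total = sum(by_flow.values())
    return by_flow.get(NO_FLOW, 0.0) / grand_total if grand_total else 0.0


# Toy numbers only, to show the arithmetic.
toy_heat = {(NO_FLOW, 1): 40.0, (NO_FLOW, 2): 20.0, (2, 1): 25.0, (3, 0): 15.0}
print(totals_by_flow(toy_heat))
print(no_flow_share(toy_heat))
```

On the real summary the same division is what gives the 78% figure.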
So let's take a closer look at the left-hand edge. The next plot is a line graph of that 78% of un-traded bitcoin, totalled according to how large a holding each address contains.
The left side of the graph shows tiny accounts - the right side shows big ones. As you can see, there is a big spike (we're on a vertical linear scale here so the spike is actual size in terms of coin value... ignore the x axis on the graph as it is not meaningful). The big spike represents dormant bitcoin packaged in 50 coin addresses. I assume these are "raw" mined blocks that are being stockpiled. There are other spikes that suggest there are raw blocks in a range of sizes sitting in cold storage (as well as other small-to-big time savings accounts). The no-flow 50 coin addresses alone contain nearly 20% of all bitcoins in existence.
Okay - one more chart. Here is a plot similar to the one above, but this time plotting the first-used time (in days before present) against the address value size. Here 'tis:
You'll notice the spike again - at the same x axis value. What that's saying is that the 50 coin addresses are very old - it suggests an average of 1500 days. A look at the raw data confirms that these addresses are indeed very old. They were mined and left in 50 coin blocks - one block per address. They start on 9th Jan 2009 (a week after Satoshi's very first block) and continue at quite a pace for a couple of years. It looks like this must be either Satoshi's holding or that of one or more of the original inner circle. Strangely, the oldest address of all - 3rd Jan 2009, which HAS to be Satoshi's and originates in the "genesis block" - appears to have been very active: 65 coins value, 891 transactions, last transaction on 17th Feb this year. Tantalizing glimpses.
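The age figure for each holding size is an average of "days since first use" over the addresses in that size bucket. A rough sketch of that calculation follows; the (balance, first-used date) input shape, the function name and the power-of-ten buckets are assumptions for illustration, and the sample dates are invented.

```python
import math
from collections import defaultdict
from datetime import date


def mean_age_by_size(addresses, today):
    """Average days since first use, grouped by power-of-ten balance bucket.

    `addresses` is an iterable of (balance, first_used_date) pairs.
    """
    ages = defaultdict(list)
    for balance, first_used in addresses:
        if balance > 0:
            bucket = math.floor(math.log10(balance))
            ages[bucket].append((today - first_used).days)
    return {bucket: sum(days) / len(days) for bucket, days in ages.items()}


# Invented sample, just to show the shape of the result.
sample = [
    (50.0, date(2012, 12, 22)),
    (50.0, date(2012, 12, 12)),
    (0.5, date(2012, 12, 31)),
]
print(mean_age_by_size(sample, today=date(2013, 1, 1)))
```

A plain average like this treats every address equally; weighting by coin value would be a different (and also reasonable) choice.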
Of course the big advantage of keeping all these blocks in separate addresses, as they have been, is that they aren't linked together under one owner. It's impossible to trace any inter-relationship or reconstruct the underlying social network... you can't tell whether they belong to one individual, a few of the inner circle or a whole bunch of people. But this pattern does suggest a single origin.
I'm going to "mine" deeper into this data - but this set of plots clearly shows the bitcoin "whales" under the surface. When they will break cover is an interesting question.
BTW if you want my python script, lemme know.