Thursday, 9 October 2014

MtGox - Time Line

A couple of days ago I dusted off some bitcoin trace data I produced a few months back and decided to look at other ways of processing it. I'm interested in what time correlation could tell us about the bitcoin transaction ecosystem. As a very quick starting point I decided to time-bin the transactions following the MtGox bitcoin show (see my previous post).

I modified my python post-processing routine to count the coin flow and the number of transactions in each of my chosen large time steps (5000 seconds), working from the very first transaction after the show point. Doing this I felt a bit like an astronomer analysing the ancient light from just after the big bang (ridiculous thought). It's pretty cool that this data is frozen forever for us to examine.
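
For anyone wanting to try the same thing, here is a minimal sketch of the binning step. It is not the actual routine: it assumes each trace record boils down to a (unix timestamp, coin value) pair, and the names `bin_transactions` and `STEP_SECONDS` are made up for the example.

```python
from collections import defaultdict

STEP_SECONDS = 5000  # bin width used in the post


def bin_transactions(transactions, step=STEP_SECONDS):
    """Count transactions and sum coin flow per time step.

    `transactions` is an iterable of (unix_timestamp, value) pairs
    (an assumed layout). Step 0 starts at the earliest timestamp.
    Returns {step_number: (tx_count, total_value)}.
    """
    transactions = list(transactions)
    if not transactions:
        return {}
    t0 = min(ts for ts, _ in transactions)
    counts = defaultdict(int)
    volume = defaultdict(float)
    for ts, value in transactions:
        step_no = int((ts - t0) // step)
        counts[step_no] += 1
        volume[step_no] += value
    return {s: (counts[s], volume[s]) for s in sorted(counts)}


# Made-up sample data, not taken from the real trace
sample = [(1000, 50.0), (2000, 25.0), (7000, 10.0), (16500, 1.5)]
bins = bin_transactions(sample)
```

Steps with no transactions simply don't appear in the result, so they need filling with zeros before plotting.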

Here are some graphs I made by exporting CSV and plotting in Excel...


Here's the first graph - number of transactions per step. You can see the rate ramp up from near zero in the first step up to hundreds of transactions per step. Note, the x axis is time step number. The steps are 5000 seconds - around an hour and a half (daft choice - I should have used hours shouldn't I :?)



If we look at the value transfer rate (above graph) we see a different story. As time goes on, the actual values transferred fall sharply. That's because the transactions themselves are getting smaller as the coins get more and more split up. Note the log scale!


So by dividing the volume by the transaction rate we get the mean size - this falls from the initial HUGE transaction steeply down. Note, I'm not considering dilution as I'm not quite sure how to handle it. Probably the mean size would continue to fall over time if I looked at the amount of coin that really originated from the original pot, as some of the later transactions might be getting quite dilute.
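
The division itself is trivial, but for completeness here is a small sketch. It assumes the per-step results are held as a dictionary of step number to (transaction count, total value), which is just a layout chosen for the example, and it ignores dilution exactly as described above.

```python
def mean_sizes(bins):
    """Mean transaction size per step = total value / transaction count.

    `bins` maps step number -> (tx_count, total_value); this layout is an
    assumption made for the example.
    """
    return {
        step: total / count
        for step, (count, total) in bins.items()
        if count > 0  # skip empty steps to avoid dividing by zero
    }


# Made-up numbers showing the kind of steep fall seen in the graph
example_bins = {0: (1, 500.0), 1: (4, 200.0), 2: (10, 50.0)}
sizes = mean_sizes(example_bins)
```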

Here is the interesting bit though. I looked again at the first graph and saw what seems to be an outlier... The big spike near the start. So I replotted the very early parts - here's what I got...



There's a huge spike of transactions at around timestep 28. A leap up to nearly 900 transactions and then it drops back again. Clearly something happened here. What I'm thinking is that someone ran a script to disperse the coins widely. This probably makes the trail really hard to follow if you do it manually. Fortunately we have computers so it's easy!

Now what I'm wondering is whether this spike corresponds to the strange whirlpool structure on my Gephi coin trace... I think that whole structure may have been formed very quickly by running a script. I'm now trying to imagine what the script code to do that would have looked like and exactly why it was done in the way it was. Next step is to add a time line to the Gephi trace and step through it. That'll let me see when that structure appears on the graph view. I will keep you posted :)

Friday, 25 July 2014

The Open Source Personal Meme Filter

Memes to avoid:
Memes that people are out to get us
Memes that sections of society are out to get us
Memes that we (the guys in the white hats) are superior to others
Memes that the people of other countries are inferior, dislike us, are plotting against us
Memes that being unkind to people is fun or cool
Memes that it's them or us
Memes that to attack the meme is wrong, that to question a meme is wrong, or reject a meme is wrong
Memes that society has been bad to us and we have a right to take revenge
Memes that deny us our free will - our right to choose our path
Memes that justify the unjustifiable

Fear memes, superiority memes, memes that surrender our free will. Memes with their own meme defences. Complexes of mutually supporting memes with no substance (houses of meme cards)

In short - memes that bring us fear and give us licence to do bad things to people

Rules: accept only memes that are free standing, testable, cause nobody any problems, bring good to people, and most of all that you believe in

A further observation: This personal meme filter is itself a meme - hopefully one that passes its own acceptance rules. It's okay to attack this meme - but hopefully it is strong.

An open source meme: This meme is published under an LGPL licence. Please feel free to build on it, but do share your results with us all. Together maybe we can build a better world.

Wednesday, 16 July 2014

Bitcoin - Digital Fortress?

I just thought I'd float this idea: In his science fiction novel of a few years ago Digital Fortress, Dan Brown describes a supercomputer built by the NSA to crack complex codes like public-private key encryption. We know now from the Snowden revelations that they've been doing a lot of funky stuff - some involving code cracking - some more focussed on back dooring, intercepting and hacking.

But Bitcoin - this technology of unknown origin... It involves an ever increasing amount of computer power computing hashes at an exponentially increasing rate. Vast bespoke computers of unprecedented power are now dedicated to this. It's driven phenomenal advances in bespoke computer power - pushing advances in GPU and bespoke ASIC hardware. I guess the two questions that spring to mind are:
1) Who is really behind bitcoin? Anarchists? Bankers? Con Men? Aliens?... The NSA???
2) Could we have helped (lured by a few riches) build a vast Digital Fortress style global computer to help our government security guys crack the most complex codes? I dunno - is this even possible?

Wednesday, 2 July 2014

The Meme Meme

Thought for the day: The term meme was coined by Richard Dawkins in his book The Selfish Gene. A meme is an idea. Ideas exist within the population (who are their hosts). The point he was making was that strong ideas live and multiply and spread - just like strong (or perhaps more aptly, fit for the current environment) genes.
The idea of a meme is, clearly, in itself a meme. The meme meme (or meme2).
Interestingly common use of the term deviates from the original. It has evolved into something new. The meaning in most quarters is now this. Grotesque image plus big white writing plus bad grammar equals meme. Simples.
But the real meaning is far more interesting. Currently our world is populated by some powerful and dangerous memes. I won't put specific names to them here. There are several - and they generally follow the same pattern. "I believe that my set of ideas is the only correct one - believers in all other sets of ideas are therefore by definition wrong". I disagree with this wholeheartedly. Many sets of ideas can get you through life very nicely and do nobody any harm, and hopefully do some good. There are only a few no-no's in my view, but complete intolerance of a diversity of views, cultures, physical characteristics etc. is definitely high on the list of bads. But why so many bad memes? Compare back against the gene example on which the meme meme is based and you immediately see the answer... Because they thrive in the current environment. That's the thing we've got to understand and do something about... What is it about the current environment that allows these quite damaging memes to thrive? The environment is pretty much us, the human hosts, or human society. We need to take a serious look at that.

Tuesday, 1 July 2014

MtGox and Instawallet AGAIN!?

I really should move on from speculating about MtGox and the murky world surrounding it. But it's HARD, with so much still obscure.

In a previous post I showed how Instawallet seems to show up prominently on the transaction trail after the MtGox dog-and-pony-show (by which I mean their "show of coins" back when). Anyway, I just found this thread dating back to when Instawallet nearly folded and got taken over. An interesting piece of history on several fronts. But check out post #19 from Mystery Miner. He seems to be reporting some weirdness whereby when he tried to use Instawallet to buy ganja on Silk Road, the ganja was actually purchased with apparently stolen coins, while his actual coins got used to purchase a "killer wet job". I have no idea what that is, but that aside it does seem he is reporting some kind of Gox-Instawallet fund confusion or interchangeability, although how reliable his evidence is may be called into question by the fact that he may be judged from his post to be:

a) a self-styled shadowy enigma and international cyber-man-of-mystery
b) a self-proclaimed buyer of illegal drugs and frequenter of the deep web
c) maybe just a bit of a sleaze

... he may have actually been onto something there though!

Monday, 30 June 2014

BitIodine Review

I just found this online tool called BitIodine. I've just had a look - it claims to do clustering etc and I was quite excited they were doing something a bit like my Gephi stuff. But it seems to be simple text-based stuff. I'm not terribly impressed - surely someone must be doing something more exciting out there?

Wednesday, 4 June 2014

Bitcoin 2.0

It's amazing how fast things move in the new world that is crypto-currency (Year 0 was only 2009). A couple of months ago we had the collapse of MtGox and others, a few months earlier it was the Silk Road seizure. Now suddenly things are looking more mature (I guess a shake-out of the weakest links HAD to happen), the market is up (bitcoin was at $660 today) and some good innovations are going on.

One of my favourite examples of bitcoin doing good things for the world is BitPesa. This allows you to send money to anyone in Kenya (and other parts of east Africa) by buying bitcoin and sending it through the BitPesa gateway. Basically you can send real money to anyone who uses the M-Pesa system - a neat pay-by-text and microfinance system run by Safaricom and Vodacom - the mobile phone giants from Kenya and Tanzania. Mobile phone usage is MASSIVE in East Africa and M-Pesa is really huge there (see article), so that's a lot of people you can send money to instantly. BitPesa is run by an American ex-pat living in Nairobi so it's obvious they understand the opportunity and what it could achieve. I hope more of this kind of thing happens as it could open up opportunities for a lot of people in a beautiful part of the world.

I found this link with a movie of the world's largest bitcoin mining corp. They're using big banks of ASICs, each controlled by a Raspberry Pi. The head guy says his profits are great, but I did the maths and worked out (using a standard bitcoin calculator) that if you buy one of the mining ASICs, at current retail price you'd be looking at a 3 year payback period. That's not much use unless you factor in some dramatic bitcoin value rises (possible I guess). I realised on thinking further that bitcoin mining will always stabilise at a point where profit is zero for most people, as whenever a better, cheaper technology comes along, people will buy it until the world's mining capacity pushes the mining difficulty up to the point where profits are zero. Hence, mere mortals should not expect to make any money. To make money you need an edge - you have to buy the hardware first, and negotiate a big discount. You need to do a deal with the electricity supplier. In short you need to be BIG, like these guys... Or you have to cheat, and get a botnet.
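
The payback sum is simple enough to sketch. This is not the calculator I used, just the bare arithmetic under the (unrealistic) assumption that difficulty, coin price and electricity price all stay constant. Every number below except the $660 price is made up for illustration, and `payback_days` is a hypothetical helper name.

```python
def payback_days(hardware_cost, coins_per_day, coin_price, power_kw, price_per_kwh):
    """Days needed for mining income to repay the hardware cost.

    Assumes difficulty, coin price and power cost stay constant.
    Returns None if the rig never pays for itself.
    """
    daily_revenue = coins_per_day * coin_price
    daily_power_cost = power_kw * 24 * price_per_kwh
    daily_profit = daily_revenue - daily_power_cost
    if daily_profit <= 0:
        return None
    return hardware_cost / daily_profit


# Illustrative inputs only - not real hardware figures
days = payback_days(
    hardware_cost=2000.0,   # purchase price in dollars (assumed)
    coins_per_day=0.005,    # mining yield (assumed)
    coin_price=660.0,       # dollars per bitcoin, as quoted above
    power_kw=0.5,           # power draw (assumed)
    price_per_kwh=0.10,     # electricity price (assumed)
)
years = days / 365
```

If difficulty rises while the hardware is running, `coins_per_day` shrinks over time and the real payback period is longer than this constant-rate estimate, which is the point of the zero-profit argument.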

Thursday, 22 May 2014

MtGox Bitcoin Trace Revisited. AKA Where did all the bitcoin go?

I've been gradually refining my software approach to analysing the bitcoin traces I can generate by reading the blockchain. My little python app is now more capable so I can feed better, cleaner data to Gephi, which sorts out the muddle and creates output imagery. There's a fair way to go yet, but I can now produce some very interesting output.

So, I decided to return to my starting point for all this and look at what happened to MtGox's bitcoin after they "demonstrated control" of a vast (well, at least pretty impressive) reserve of coins in 2011 in order to counter suggestions they might not have enough reserves of "real" coin to cover people's trading accounts. It was a bit of a dog and pony show. The new traces show up some very interesting features, including one or two I still find pretty baffling.

So, here is the starting point for the trace, visualised using Gephi. Remember, each sphere is a bitcoin address, and in this case each little arrow is a transfer route that was exercised in the first 5000 transactions after the "show". The MtGox wallet shows up red, a little below centre, with numerous arrows emanating from it. The redder the node, the better it is connected to MtGox. There is quite a lot of "fan-out" even in this image, with bitcoin heading off in lots of directions.


If we pull back a bit we can see this is just part of a wider ecosystem:

Around 5000 addresses - with some noticeable "clumps" of organisation (although nothing on the scale of Satoshi Dice, discussed in a previous post). The group shown below is the biggest.


This one is Instawallet - a "we host your wallet" site that folded stating hacking and loss of coins as the reason. Another of the notable groups is shown below:

The prominent "hub" here is Coinb - another wallet service that also appears to have folded. The presence of these two (defunct) wallet sites does prompt the question as to whether MtGox could have borrowed the bitcoin they used for the "show". That has to remain a subject of conjecture - at least for now. Other explanations could be that Gox users withdrew funds and stored them in online wallets - or that Gox stored some of their funds there.

Now here is the strangest feature of all. You have to pull right back to see it.
See the big loop? That appears to be a string of hundreds of addresses, with bitcoin being transacted on and on, along the chain. I have no idea what the idea of this is... Is it an obfuscation method? If so, it isn't a great one as it stands out like a beacon of bizarreness. We can zoom in:

Here is the start of the loop. As you can see, the loop has two strands of addresses, which transact along the chain and also direct a little bitcoin in to the "centre" addresses. The second strand appears to be a second pass around the loop - so it's really a big spiral. Could be if I run more transactions through the trace we'd get more turns. If anyone can explain what's going on, I'd really like to hear.

I'm off for another round of development of the python scripts - I have a couple more additions in the pipeline that will give more detail.





Thursday, 15 May 2014

Silk Road Seized Coins and SatoshiDice


I have been doing some more development work on my bitcoin tracing software - which consists of a C++ program to read the blockchain and pull out interesting transactions by following them in a kind of daisy chain fashion, plus, latterly, some python scripting to do some initial analysis and write them out in a form that can be read by Gephi - where I do some further analysis and layout. It's working well now. I'm now writing a GML file for Gephi (but currently thinking of switching to GraphML for various reasons).
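
To give a flavour of the GML step, here is a minimal sketch of turning a list of transfers into GML text. It is not my actual script: the (source address, target address, value) tuple layout and the name `edges_to_gml` are assumptions for the example, and the coin amount is simply stored as a `value` attribute on each edge.

```python
def edges_to_gml(edges):
    """Render (source_address, target_address, value) edges as directed GML text."""
    edges = list(edges)
    # GML nodes need integer ids, so number addresses in order of appearance.
    ids = {}
    for src, dst, _ in edges:
        for addr in (src, dst):
            ids.setdefault(addr, len(ids))

    lines = ["graph [", "  directed 1"]
    for addr, node_id in ids.items():
        lines.append(f'  node [ id {node_id} label "{addr}" ]')
    for src, dst, value in edges:
        lines.append(f"  edge [ source {ids[src]} target {ids[dst]} value {value} ]")
    lines.append("]")
    return "\n".join(lines) + "\n"


# Placeholder addresses and amounts, not real trace data
sample_edges = [("addrA", "addrB", 120.0), ("addrB", "addrC", 119.5)]
gml_text = edges_to_gml(sample_edges)
```

Saving `gml_text` to a file with a `.gml` extension gives something that can be opened from Gephi's file menu.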

As an exercise I thought I'd take a look at the Silk Road Seized Coins address (coins seized when the FBI shut down the Silk Road deep web black market site). I thought it would be interesting to map out a little of the bitcoin geography around there and see what showed up.

The trace came out huge - a million plus transactions - so I restricted my analysis to just a few thousand. Here is a zoomed out view of the transaction graph I produced:
The nodes in yellow are those connected directly to Silk Road. The lower part of the diagram is loosely connected addresses - mostly in long chains - but the clumping at the top is totally different! Here is a zoomed-in shot of that region:


You can make out some nodes around the edge of the innermost clump with large numbers of connections. The nodes in the middle clump are special because they are each connected to several of the highly connected "master" nodes. Incidentally there is a yellow node in there. The pink nodes are those two steps removed from "Seized Coins". So what's going on here?

Those highly connected master nodes are all public facing SatoshiDice addresses. SatoshiDice is officially a gambling site. "The Ghost of Satoshi will roll the dice and pick a Lucky Number!" it says. However it is allegedly (from some sources I have read) also used as a tumbler to obfuscate bitcoin trails. To use it you simply send bitcoin to one of its nodes (each pays at a specified rate) and the nodes send coin back to your address if you win. I can see this would make a powerful tumbler. I'm guessing the nodes in the middle clump are back-end routing and storage nodes. The fact my trace gets so complex in this area just shows how effective the mechanism is at creating complexity and confusion. They could be a real challenge to unpick! Perhaps the association with SR was merely that some customers used it to decouple their various wallets for security purposes.

Here is one more screenshot from the vicinity of the SatoshiDice clump. This one is created from a trace extract with over 50k transactions.


Tuesday, 13 May 2014

Just one more Thing

Having ranted about the IoT thing, I thought I should go and check where it originated. It turns out the phrase was coined by Kevin Ashton, a British-born (Birmingham) technology pioneer working at MIT. It dates back to 1999 (surprisingly) so I managed to avoid hearing it for a good number of years.

In fact Kevin Ashton worked on Radio Frequency ID (RFID) and the term was originally applied to tagging Things with RFID or QR cards. I.e. the IoT was about cataloguing, tracking and identifying Things, rather than connecting them to the internet.

To me, this idea, whilst it's lower tech, is far more of a world-changer (albeit in a low-key way) than connecting your fridge to the internet so you can see whether your ham has gone off while you're on holiday. And, I'd grudgingly admit, even worthy of its own catchphrase.

Monday, 12 May 2014

The internet of annoying things and vacuous jargon

One of the truly annoying things about the mainstream computing industry is the quest for the Next Big Thing and all the baloney jargon associated with it. Generally things seem to get re-invented on around a ten year cycle and given new names so the excited techno-sheep don't notice as they flock to the next piece of greener grass. All our developers are doing Scrum now - which is new and trendy - but seems to be a coming together of bits of other methodologies that have re-coalesced into something very like what I was doing 15+ years ago (I think we called it DSDM back then).

The buzzword that really gets up my nose at the moment is... I can hardly bear to write the words... The Internet Of Things. There - that wasn't too painful - I managed to stop myself throwing up too!

I hate that phrase! It surprises even me that I hate it so much, but I do. Why do I hate it?

  1. It sounds like a separate system - the internet of things. It's not, it's just the same old internet.
  2. The "things" (to adopt the meaningless nonsense jargon) do not make up the internet... They are just a small subset of the peripherals connected to it.
  3. The internet has always had "things" connected to it, and been made up of things. It's made up of cables, routers and whatnot and has things called computers, webcams etc. connected to it. And lots of other stuff too - scientific equipment... you name it. One of the earliest things you could do on the internet was log onto somebody else's supercomputer in some other country and run your jobs on it. It doesn't get much cooler than that.
  4. "Things" is a word that conveys the bare minimum of information. The Things they are talking about are sensors, mechanisms and cool consumer devices. Why can't they just say the new big thing is "cool consumer stuff you can connect to the internet". Far too accessible to normal people I suspect. And less like a bandwagon everyone needs to jump on to avoid being "left behind".
  5. The really cool development around the internet in recent years - the whole web 2 thing - is the way people can connect to people. Web 2 has been (and is) about the Internet of People. I'd argue that people are far cooler than things as connectees to the internet - they do random and surprising stuff - like raising money, forming pressure campaigns, trading, making friends, falling in love, being heard. Things can't do any of that. To me, Web 3 will be about extending how people can connect to each other and reach out, not about allowing them to switch on their Sunday roast (although that's fine too - just not very interesting!).
Please, can this IoT jargon go away? I'm hoping for a big social media-led backlash on this one!

Saturday, 10 May 2014

Weird Science

No - I don't mean this.

In this week's New Scientist (I have a copy shipped to Luxembourg from the UK) there was an article about a new interpretation of Quantum Mechanics. Just googling I found Scientific American has a similar article with (I believe) the same title (is this morphic resonance in action again?). The crux of the articles is that much of the weirdness of QM is a mis-interpretation of our observations. I believe Bart Kosko also pointed out years ago in his book Fuzzy Thinking that uncertainty principles (as per Heisenberg's) actually occur in normal physical situations and have an easily explicable basis in our confusion over what we are actually measuring (i.e. kind of a gap between human language and thought and physical reality). Now, in a similar way, it is being argued that quantum mechanics may not be that weird at all. It all comes down to Bayesian statistics and the way measuring something resolves some, but not all, uncertainty about it. This new take on things sounds very plausible, and quantum mechanics and its consequences have clearly worried many great minds. But all that "spooky interaction at a distance" did sound quite fun. As did Schroedinger's cat. Incidentally the latest article cited a similar thought experiment involving human observers in a big box, opening a box containing Schroedinger's cat. It did cross my mind what adding a further layer or two of boxes and larger and larger observers would do. I had to have a beer to wash that thought away...

Three questions occur to me though:
1) Does this idea lead to the conclusion that the universe is deterministic (with all the randomness QM introduced removed) or is there still an underlying random nature there?
2) Does the removal of non-determinism (if there is one) remove the possibility of free will and free thinking?
3) Does this new interpretation shrink the breadth of "the unknown" in physics leaving less room for the kinds of things Rupert Sheldrake talks about (Morphic Resonance and the like)? How astoundingly dull that would be.

Answers on a postcard please :). Rupert - I'd love to hear your thoughts on this development.

Thursday, 8 May 2014

Is the universe a giant simulation?

Last year I read Rupert Sheldrake's book "The Science Delusion" having seen his banned TED talk. His main theme is that science knows far too little about the universe to pronounce on what can and can't occur within it - which is the counter argument to Richard Dawkins' materialist view, as discussed in "The God Delusion", which asserts that science knows everything and there's no room for god. Personally I take Rupert's side (and I do have a PhD in Physics in which to ground my disbelief in the all-powerful nature of science!).

One of Rupert's theories is "Morphic Resonance" - the observation that once a complex interaction has occurred once in nature, it appears to be more likely to occur again. He cites a simple example of newly synthesized materials (never seen before in nature), which initially fail to crystallise anywhere in the world, but magically once someone has done it once, it becomes easy to replicate the process all over the world. Another is animal behaviour. Once a bunch of monkeys in one group work out a new thing to do with a bean tin, other troops seem more likely to come up with the very same idea. His morphic resonance theory suggests that wherever similarity of form occurs there is some kind of connection between the similar entities that transcends physical distance. I guess the fascination of ancient cultures around the world with building pyramids could be caused by the same effect, rather than those oft-speculated ancient ship journeys.

It suddenly occurred to me (having worked a lot with computer simulations and models) that one way a simulation can be made more efficient is to store the results of very complex calculations so they can be reused each time the same situation needs to be simulated. It's an essential technique when pushing the bounds of the possible and simulating very complex situations.
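
In programming terms this is usually called caching or memoisation. Here is a toy sketch using Python's standard `functools.lru_cache`; the function `expensive_interaction` and its arithmetic are just stand-ins for a genuinely costly simulation step.

```python
from functools import lru_cache

call_count = 0  # counts how often the real work is actually done


@lru_cache(maxsize=None)
def expensive_interaction(a, b):
    """Stand-in for a costly calculation; the arithmetic is a placeholder."""
    global call_count
    call_count += 1
    return sum((a * i + b) % 7 for i in range(100_000))


first = expensive_interaction(3, 4)   # computed the hard way
second = expensive_interaction(3, 4)  # same inputs, served from the cache
```

The second call never runs the loop at all - the stored answer is simply looked up by its inputs.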

Bizarrely, if you believe Rupert Sheldrake's many examples of Morphic Resonance, it appears the universe may do something very similar - caching the solutions to complex problems so they can be used again elsewhere! But why and how? I have no answer to that (as yet :) ). But if the universe was actually like a giant simulation - running in some all-encompassing Turing machine, it might well exhibit these kinds of behaviours. It's a weird thought.

Wednesday, 30 April 2014

Bitcoin, why do I love thee?

I'm still working on a bunch of analysis software to follow coin trails through the bitcoin ecosystem. I have downstream tracking pretty much sorted but upstream tracking (where did these coins come from) is more complex using the understanding I currently have. But a thought occurred to me... Why am I doing this project?
There are two reasons

1) There are people out there who are happy to abuse the bitcoin ecosystem - to steal and to cheat ordinary people of their savings. They think bitcoin offers a smokescreen that will hide their dodgy dealings and antisocial activities. The neat thing is that they are probably very wrong... bitcoin provides data analysts with a complete, unabridged and utterly fascinating history of every single transaction, ever. It's truly Big Data - a vast dataset and a big challenge. But Big Data is cool now. Look at what search engines and social media do with their millions of customers... The tools to unravel the bitcoin block chain are possible and will emerge over the next few years and this kind of fraud will become a thing of the past. The big difference between the virtual world of bitcoin and the world we are used to is that with bitcoin, the evidence is locked forever in the blockchain by a method that is "computationally impractical" to reverse (to use the words of Satoshi Nakamoto). It will never fade or degrade. The tools of the near future will look back at the early history of the blockchain and the truth will emerge - sure as eggs.

2) (but probably should be 1!) Bitcoin is a powerful force for change and potentially that's change for good: Just a few points:
  • Bitcoin can connect people with people, anywhere and everywhere and can let them trade, without banks, without currency conversions, without huge fees (just tiny ones) without barriers (without taxes too potentially). It empowers people to do their thing, trade on their own terms and connect person to person. It does what currency should do - it oils the wheels of human commercial endeavour.
  • If people connect with people across the world it can unite us - regardless of our governments, politics, religion, whatever. People are just people - we're all the same and the more we connect, the more we realise that.
  • Bitcoin is technologically amazing. It is a thing of wonder and beauty. Its peer-to-peer centre-less technical architecture mirrors the peer-to-peer social architecture it promotes. And the way it establishes and records an irrevocable and unchangeable truth is a true innovation. I was reading about Merkle Trees today (every block chain block stores its transactions in one) and apparently they are believed to be resistant to attack even by quantum computers (which themselves are largely a theoretical notion (as I understand it!!)). This is technology of a new kind.
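
For the curious, here is a rough sketch of how a Merkle root is built, as far as I understand it: hashes are combined in pairs, level by level, until one is left, with the last hash duplicated when a level has an odd count. The toy "transactions" are placeholders, and real blocks hash serialized transactions with byte-order details that are skipped here.

```python
import hashlib


def double_sha256(data: bytes) -> bytes:
    """SHA-256 applied twice, the hash used for Bitcoin's Merkle trees."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(leaf_hashes):
    """Combine leaf hashes pairwise until a single root remains.

    Duplicates the last hash when a level has an odd number of entries.
    """
    if not leaf_hashes:
        raise ValueError("need at least one leaf")
    level = list(leaf_hashes)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        level = [
            double_sha256(level[i] + level[i + 1])
            for i in range(0, len(level), 2)
        ]
    return level[0]


# Toy stand-ins for transactions
leaves = [double_sha256(tx) for tx in (b"tx1", b"tx2", b"tx3")]
root = merkle_root(leaves)

# Changing any one transaction changes the root
tampered_leaves = [double_sha256(tx) for tx in (b"tx1", b"txX", b"tx3")]
tampered_root = merkle_root(tampered_leaves)
```

That last property is what makes the record so hard to rewrite: alter one transaction and the root stored in the block no longer matches.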
All in all, I love this bitcoin stuff - I really do... I'm new to the field but it's definitely got me. I wish I'd discovered it early and made some money mining a lot of coins - but don't we all I guess.

Monday, 21 April 2014

Is Bitcoin a gift from the gods?

The BBC produced this report in 2013...

http://www.bbc.co.uk/news/business-22366064

People kind of took the "alien technology" thing and ran with it:

http://www.examiner.com/article/wake-up-bit-coin-is-an-alien-invasion

A little paranoid! But all the same it's an interesting thought. Look at the evidence. Nobody knows the true identity of the bitcoin creator, Satoshi Nakamoto (generally believed to be a pseudonym although, as widely discussed, Newsweek found and proudly paraded before us a protesting Satoshi Nakamoto from LA, who denied everything and clearly couldn't have written the Bitcoin White Paper); he appears on a forum, introduces his creation then fades away. His creation is a work of brilliance - profoundly innovative. Efforts to analyse the language of the White Paper SEEM to point to a suspect, but I tend to side with people who say it reads more like the work of several authors. To me, the voice seems to wobble slightly - maybe the work of multiple authors working and discussing the text together. But equally it COULD be the work of a very good but not quite perfect auto-generation/translation...

If we were/are being observed by extraterrestrial beings, the internet would give them a remarkably powerful new way of observing our world and even influencing our culture. And if we are Not Alone, why wouldn't we expect to be of interest to our fellow citizens of the galaxy. I doubt it would be hard for them to engineer a gateway onto the internet. Flying up to a communications satellite and plugging something in would be a neat way - but a software hack would probably be just as easy! Just think of the massive entertainment that would be for an interested other-world civilisation! A lot more fun than watching us through a big telescope or flying down drones.

If bitcoin were an alien introduction, I'd suspect they meant it as a gift to the people of earth. Maybe they are thinking we need a bit less national government on this planet of ours and more freedom for people to trade creatively with like-minded others around the world. I would second that emotion...

Friday, 18 April 2014

MtGox and Instawallet

In my last post I included a trace of the MtGox coin flows immediately after their Nov 2011 parade of 500k bitcoins. At the centre of the drawing there is a clear "hub" for a lot of flow. The same hub showed up even better in some of the traces I did today. It turns out that that is Instawallet, which folded in 2013 after breaches in security. I'm wondering, is there a close relationship between MtGox and Instawallet? More particularly, was it the security issues at Instawallet that somehow caused the funding crisis at MtGox? Does anyone have more info on this?

Wednesday, 16 April 2014

MtGox Bitcoin Trace

Today I have done the first useful processing of the "transaction trace" I produced from the big MtGox show-of-bitcoins in December 2011. After the "show" the bitcoins were reportedly routed away out of sight in interesting ways (see my last-but-one post). I wanted to do some processing to confirm this.

So, I took my big trace file (around 70 MB of csv and still running when I stopped the program), then ran it through a python script to select only transactions above 100 coins (just to get the data volume under control). That yielded a Gephi-compatible ASCII edge file with around 30k edges defined. I put that into Gephi and tried some layout options. I found Force Atlas 2 was the only layout manager that produced interesting results in less than a couple of hours. But HOW interesting!!!
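
The filtering step looks something like the sketch below. It is a simplified stand-in, not the real script: it assumes each trace row is just `source,target,value` with no header, and it writes a `Source,Target,Weight` CSV edge list, which is one of the layouts Gephi's spreadsheet import understands. The name `filter_edges` is made up for the example.

```python
import csv
import io

THRESHOLD = 100.0  # keep only transfers above 100 coins, as described above


def filter_edges(trace_file, out_file, threshold=THRESHOLD):
    """Copy rows with value > threshold into a Source,Target,Weight edge list.

    Assumes each trace row is `source,target,value` with no header row;
    the real trace layout may well differ.
    """
    reader = csv.reader(trace_file)
    writer = csv.writer(out_file, lineterminator="\n")
    writer.writerow(["Source", "Target", "Weight"])
    kept = 0
    for source, target, value in reader:
        if float(value) > threshold:
            writer.writerow([source, target, value])
            kept += 1
    return kept


# In-memory stand-ins for the trace file and the edge file
trace = io.StringIO("a,b,500\nb,c,99.5\nb,d,400.5\n")
edges = io.StringIO()
kept = filter_edges(trace, edges)
```

With real files you would pass objects from `open(...)` instead of the `StringIO` stand-ins; because rows are streamed one at a time, the size of the trace file shouldn't matter.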

One thing the exercise showed was that I still have too much data. It pretty much overwhelms Gephi - which is the most powerful tool I know of for this kind of stuff. Reducing the data without losing valuable info will need some thought. But here are a couple of Gephi pics to illustrate things I found.


First off, the above image shows Gephi Force Atlas 2 beginning to unravel the structure of a reduced set of data - this one was transaction size limited to 4000 coins (only transactions over 4000 coins). It looks like a big can of worms (and maybe indeed that's what it is). Remember you're watching over a couple of million Euros worth of bitcoin moving there. The ribbons appear to be chains of transactions with one input and one output. It looks like coins were transferred through a really large number of addresses in a linear fashion without being dispersed much.


Now, above is the really interesting one. This is one detached part of the full graph view. It shows the period immediately after the coins leave the main MtGox address. This time the transaction threshold is set at 1000 coins to show up more fine detail. The trace starts at the small toasting fork feature, off the right hand branch (angled at about 4 o'clock). Just as described by the previous author, the coins get bucket-chained along for quite a while, then they are split repeatedly smaller and smaller, giving the tree structure. This matches the previous investigation. They seem to get split down below my transaction size threshold and disappear from the trace, only to show up later in the huge ribbon structure. There is also a big recirculating loop structure (upper left) with bigger arrows indicating multiple transactions. Is that a tumbler?
There's a lot more to be investigated here, clearly, but I'm at the limits of the techniques I know right now and I need some fresh ideas to deal with these massive data sets. Still, I'm pleased with these initial results.


Tuesday, 15 April 2014

The game is a foot

Well, I tried to run my trace. Something isn't quite working. First problem - I couldn't verify what the ref'd blog post said about the coin movement from the quoted gox address to the point where "interesting things" happened. In fact the vast trace I produced (70mb when I stopped it) showed no outputs on the trail from MtGox greater than 100 coins (I'm expecting to see around 400k coins flowing). So... I just looked upstream from the "interesting things" transaction using bitcoin info and found a really interesting flow through what looks like an obfuscator, leading back to a MtGox wallet. The flow starts with this transaction:
31066fbaa7dbfcbde6f7053d5f825c39d0dc3eeafb1fdc9acef4b146e422bf1


from the MtGox address 1LNWw6yCxkUmkhArb2Nf2MPw6vG7u5WG7q to address 1Cj5kAeGK5CQSubY1Bk6HbGdweq15eJtH9. I'm not sure why my app couldn't find this path. Anyway, I'll rerun the trace from there and see where I get to. Could be the obfuscator defeated the trace (a bit worrying!)

Latest: Okay - I just did a trace through the latter address. It took half an hour or so. This address was a useful filter as it only has one transaction in and one out... effectively a throw-away address used just the once to pass the big coin trail through. Most of the key addresses on this trail follow a similar pattern - suggesting this was part of the movement plan.

The trace runs to around 1m transactions so it will take a little work to extract the useful info. I may post it raw on google docs for others to play around with it. Watch this space tomorrow.

Saturday, 12 April 2014

Tracing stolen bitcoins wherever they go

I have been working on an application to trace "lost or stolen" bitcoin. I now have a C++ app that will read the entire blockchain and trace transactions from a specified account triggered by a specified date (e.g. the date of the theft). What's powerful about it is that no matter how long and complex the trails, the program can follow them forever and produce a "connected graph" output you can then visualize with a suitable tool. I'll probably start with Gephi - although it's certainly not perfect for this task it'll do something with the data and give me a visualisation. I'm hoping to use it on MtGox tonight and should be able to produce a complete trail that shows where all the bitcoin they "demonstrated control of" back in late 2011 ended up. I'm following up this post by a guy who's tried to do this manually but gave up because of the complexity of the way the coins were subsequently split down into smaller and smaller addresses. Hopefully this app will do the whole job. If anyone's interested in following up other coin disappearances let me know and I can run my app (maybe for a small coin donation!).

I'm interested to know if anyone's tried this. Also to know if there are ways people can prevent this kind of trace. I gather a "splitter" or "tumbler" could be used to try to prevent it but I'm not sure how they work... Any thoughts gratefully received.

In writing the code for this I finally (I think!) grasped the way the blockchain and transactions are structured. I did struggle quite a bit to get my head around it, and earlier versions of the code produced traces with anomalies caused by transaction "corner cases". I learned the hard way that bitcoin transactions are more complex and a lot more powerful than I imagined. The fact that each transaction can have multiple inputs (i.e. lumps of bitcoin value assigned into the sending wallet) and multiple outputs (i.e. lumps of bitcoin value assigned out from the sending wallet to other wallets, and also "change" returned to the sending wallet) was the big issue. This means that a perfect trace of a given coin is impossible as the transaction effectively creates new output bitcoin that is a mix of all the inputs (which were themselves outputs of earlier transactions). It is both beautiful and complex! Effectively my trace should be able to tell you "of the original transaction from address a, value x, y% of the value ended up at address b, having been diluted through mixing with other transactions by z%".
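To make the "y% ended up at address b, diluted by z%" idea concrete, here is a minimal Python sketch of one possible rule: every output of a transaction inherits the same traced fraction as the mix of its inputs. This is an assumed proportional-mixing model, not the logic of the C++ app, and the transaction layout (`txid`, `inputs`, `outputs`) is hypothetical. Fees are ignored.

```python
def trace_value(transactions, seed_outpoint, seed_value):
    """Follow value from one seed output through a list of transactions.

    transactions: dicts in chain order with
        "txid":    str
        "inputs":  list of ((txid, index), value) pairs being spent
        "outputs": list of (address, value) pairs being created
    Returns {address: (traced_coins, total_coins)} for unspent outputs.
    """
    # unspent outpoint -> (address, traced coins, total coins)
    holdings = {seed_outpoint: ("seed", seed_value, seed_value)}
    for tx in transactions:
        total_in = sum(value for _, value in tx["inputs"])
        traced_in = 0.0
        for outpoint, _ in tx["inputs"]:
            if outpoint in holdings:
                traced_in += holdings.pop(outpoint)[1]
        if traced_in == 0.0:
            continue  # this transaction does not touch the traced coins
        share = traced_in / total_in  # traced fraction of the input mix
        for index, (address, value) in enumerate(tx["outputs"]):
            holdings[(tx["txid"], index)] = (address, value * share, value)
    summary = {}
    for address, traced, value in holdings.values():
        t, v = summary.get(address, (0.0, 0.0))
        summary[address] = (t + traced, v + value)
    return summary

# 100 traced coins are split 60/40, then the 60 are mixed with 60 untraced coins.
txs = [
    {"txid": "t1", "inputs": [(("t0", 0), 100.0)],
     "outputs": [("B", 60.0), ("C", 40.0)]},
    {"txid": "t2", "inputs": [(("t1", 0), 60.0), (("x", 0), 60.0)],
     "outputs": [("D", 120.0)]},
]
result = trace_value(txs, ("t0", 0), 100.0)
for address, (traced, total) in sorted(result.items()):
    print(address, traced, total)
```

In this made-up example address C ends up holding 40 undiluted coins, while D holds 120 coins of which 60 came from the original pot: 60% of the original value, diluted by 50%.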

Just a quick question. My understanding, having studied a bit and done this coding, is that contrary to what many people think, bitcoins don't have their own unique identity... There is just bitcoin value that moves between addresses via transactions, mixing in with other bitcoin value as it goes? It would be good to hear others' thoughts on this as I want to make sure I got it right. I heard a quote that the first ten times you think you understand bitcoin, you don't understand bitcoin. I think I am somewhere about the 4-5 times mark at the mo.

BTW I found Petri Net notation (or something very much based on it) useful for analyzing the different bitcoin transaction types my code needs to understand.

Thursday, 13 March 2014

Just one more thing...

One more bit of data extracted from the big wallets list... I ran a script to track the build-up of value over time in the "savings" addresses. So here is the graph... Savings as they increase over time. Notice the total climbs up towards 10 million coins (worth around $6bn at market rates) - as I said before, 78% of bitcoin is in savings accounts.


A few noticeable features: 
  • Ongoing increase of savings over time
  • Rapid saving from the off. This is a period when coins were mined by a few individuals for a couple of years and stockpiled in 50 coin blocks (see previous post). Presumably there was almost no spending at this time.
  • Some peaks in saving rate. This may be the times when bitcoin boomed. See the steepest rate is at the end of the line - corresponding with the massive surge of interest.
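For anyone wanting to reproduce a curve like this, a build-up calculation can be sketched in a few lines of Python. This is a guess at the approach rather than the script actually used: it assumes each address record carries a first-used date, a current balance and a total outflow (hypothetical field order), and treats an address as "savings" when its outflow is zero.

```python
from datetime import date
from itertools import accumulate

def savings_curve(addresses):
    """Return [(first_used, running_total)] over zero-outflow addresses.

    addresses: iterable of (first_used: date, balance: float, outflow: float)
    """
    dormant = sorted((d, b) for d, b, out in addresses if out == 0)
    totals = accumulate(b for _, b in dormant)
    return [(d, t) for (d, _), t in zip(dormant, totals)]

# Made-up sample: the third address has spent coins, so it is excluded.
sample = [
    (date(2013, 11, 20), 1000.0, 0.0),
    (date(2009, 1, 9), 50.0, 0.0),
    (date(2011, 3, 2), 200.0, 150.0),
    (date(2010, 5, 1), 50.0, 0.0),
]
curve = savings_curve(sample)
for day, total in curve:
    print(day.isoformat(), total)
```

Writing the pairs out as csv gives something Excel can plot directly as a line graph.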

Wednesday, 12 March 2014

Bitcoin Address


My Bitcoin Address: 1F1njXCvZGtwzWmKrb1txKiFugnaoavTsQ

Monday, 10 March 2014

Bitcoin whales

I've written a little python script to analyse the big bitcoin address dataset I mentioned the other day. When I say little, I mean little (I'm new to python so it was slow going, but worth it as I learned a few basics along the way). What the script does is create a "heatmap" showing how bitcoin value is spread across the addresses - i.e. "where" the bitcoins are. To do the processing I first used Excel to save a csv file (prior to this I changed the number format to get rid of commas in the middle of decimals, which make unpacking the csv harder). Then the python script reads through each line, builds a summary and writes that out as another csv that I can read into Excel to do graphs. It's clunky but works.
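A rough sketch of that kind of summary script is below. It is not the original: the column names (`balance`, `outflow`) and the choice of one bin per power of ten are assumptions for illustration. It also shows that Python's `csv` module can cope with thousands separators as long as the field is quoted, so the Excel reformatting step may be avoidable.

```python
import csv
import io
import math
from collections import defaultdict

def log_bin(x):
    """Decade bin for a value: None for zero, else floor(log10(x))."""
    return None if x <= 0 else int(math.floor(math.log10(x)))

def parse_number(text):
    """Parse numbers like '1,250.5' by dropping thousands separators."""
    return float(text.replace(",", ""))

def build_heatmap(lines):
    """Sum coin balances into (outflow bin, balance bin) cells."""
    heat = defaultdict(float)
    for row in csv.DictReader(lines):
        balance = parse_number(row["balance"])
        outflow = parse_number(row["outflow"])
        heat[(log_bin(outflow), log_bin(balance))] += balance
    return dict(heat)

# Made-up sample data; real input would be the exported address list.
sample = io.StringIO(
    'address,balance,outflow\n'
    'a1,"1,250.5",0\n'
    'a2,50,0\n'
    'a3,50,0\n'
    'a4,3.2,"12,000"\n'
)
heatmap = build_heatmap(sample)
for cell, coins in heatmap.items():
    print(cell, coins)
```

Cells whose first key is `None` correspond to the no-flow edge of the chart; the two 50 coin addresses in the sample land in the same cell and are summed.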

So here are the results:


The first graph shows a "raw" heatmap. It's a similar idea to the scatter plot I did in a previous post. The leftmost edge of the chart shows bitcoin value in accounts with the lowest "flow" - i.e. the smallest amount of in/out traffic through the accounts. In fact the left hand edge itself represents bitcoin with no flow at all. The area to the right of that edge shows bitcoin in accounts with progressively more and more flow. As you can see, most of the bitcoin value is on the left hand edge and there's really much less in the middle. Similarly the area of the chart nearest to you the viewer is bitcoin in low value accounts - that farthest away is in the highest value accounts. By the way, like the original scatter plot the horizontal scales on this plot are logarithmic. The vertical scale is linear so you can read off the actual number of coins involved on the vertical scale. Because of the logarithmic horizontal scale the area nearest the viewer shows tiny addresses and those on the far edge are monsters.

So, the analysis actually shows that 78% of bitcoin is in no-flow accounts. That is a lot. Only 22% are actively in circulation and being traded etc. This next plot looks "along" the edge - totalling up all the coins for each address activity level. It's on a log scale both ways - otherwise the peak would be overwhelming. This just illustrates the issue. The big peak on the left is bitcoin in accounts with no outflow. Ten million coins or thereabouts - 78% of total value.



So let's take a closer look at the left hand edge. The next plot is a line graph of that 78% of un-traded bitcoin, with the addresses totalled according to how large a holding they contain.



The left side of the graph shows tiny accounts - the right side shows big ones. As you can see, there is a big spike (we're on a vertical linear scale here so the spike is actual size in terms of coin value... ignore the x axis on the graph as it is not meaningful). The big spike represents dormant bitcoin packaged in 50 coin addresses. I assume these are "raw" mined blocks that are being stockpiled. There are other spikes that suggest there are raw blocks in a range of sizes sitting in cold storage (as well as other small-to-big time savings accounts). The no-flow 50 coin addresses alone contain nearly 20% of all bitcoins in existence.

Okay - one more chart. Here is a plot similar to the above one, but this time plotting the first used time in days before present against the address value size: Here Tiz:

You'll notice the spike again - at the same x axis value. What that's saying is that the 50 coin addresses are very old - it suggests an average of 1500 days. A look at the raw data confirms that these addresses are indeed very old. They were mined and left in 50 coin blocks - one block per address. They start on 9th Jan 2009 (a week after Satoshi's very first block) and continue at quite a pace for a couple of years. It looks like this must be either Satoshi's holding or one or more of the original inner circle. Strangely the oldest address of all - 3rd Jan 2009, which HAS to be Satoshi's and originates in the "genesis block" - appears to have been very active - 65 coins value, 891 transactions - last transaction on 17th Feb this year. Tantalizing glimpses.

Of course the big advantage of keeping all these blocks in separate addresses as they have been is that they aren't linked together under one owner. It's impossible to trace any inter-relationship or reconstruct the underlying social network... you can't tell whether they belong to one individual, a few of the inner circle or a whole bunch of people. But this pattern does suggest a single origin.

I'm going to "mine" deeper into this data - but this set of plots clearly shows the bitcoin "whales" under the surface. When they will break cover is an interesting question.

BTW if you want my python script, lemme know.

Thursday, 6 March 2014

Bitcoin Creator Unmasked

So, the true identity of the mysterious creator of bitcoin - the man behind the pseudonym Satoshi Nakamoto - is revealed (in Monday's Newsweek - somehow missed that - news takes a while to reach us in Luxembourg). His true identity... Satoshi Nakamoto - which proves the validity of the old wisdom about hiding things in plain view. Now we know who he is, the world can congratulate him for a truly great world-changing invention.
I was planning to do some work on crunching the bitcoin ID dataset some more but it never happened. I did get the blockchain cruncher to build and run but it seems to throw a bad_alloc exception. I'm guessing it needs a 64 bit platform. I need to take a look at what he's done... It shouldn't need to fit the whole blockchain in memory to process it, surely!! Shouldn't it be just a sequential read-through?

Wednesday, 5 March 2014

Blockchain revelation

I downloaded a monster spreadsheet just released by John Ratcliff (see blog ref below) - created by his fast C++ blockchain reader tool - giving a load of info on the top 100,000 or so bitcoin wallets. I've done some work with excel and just produced this scatter plot. I think it shows this data source is both rich and apparently flawless.


I'll explain what it shows, as the axes aren't labelled (sorry - sloppy).
Each point is a bitcoin address. The vertical axis tells you the total volume flow for that address (in number of coins). The horizontal axis tells you the current value. So, points at the top have the highest all-time flow and points on the right have the highest current value.
The area at the bottom with no points is just because excel maxes out at 32k points. I plotted the highest flows. The very clear diagonal line is just due to the fact that the current value of the address can't exceed its total flow (the coins had to flow in at some point!). The vertical bars are there because the values are rounded to one whole coin. Note, the axes are logarithmic so things at the extremes are much bigger than they seem (the numbers of the axes represent actual numeric values).

Now the interesting initial observations:
  • One address has a huge flow - 15 times larger than all the rest at some 50 million coins. Its current value is just 177 coins though. It's clearly something very significant.
  • Many wallets have just one transaction inwards. This includes some of the biggest value wallets. These are clearly hoards/savings.
  • Plenty of wallets have flows above 10k coins but near-zero value. These must be some kind of coin transfer or business nodes.
  • There is some extremely strange flow-balancing on some of the very high flow addresses - see the horizontal feature high to the left. Around 25 addresses with flows of 390k coins each - what's that all about!?
  • Further down (and not visible in the plot) there are innumerable one-transaction addresses with value 50 coins each. This looks like a risk reduction strategy for a massive holding. I haven't done the maths yet but this could be one or more very big hauls.

There's lots of fodder here (although I may need to start writing some python to crunch the numbers as Excel is creaking at this kind of data volume). The next thing to get hold of (based on instructions from John Ratcliff) is a full blockchain transaction dump. That will be a step up in data volume (more programming!) - but should be interesting!

Tuesday, 4 March 2014

Unlocking the blockchain

After a quick search I found a neat blockchain reader today. Here tiz :-) . Thanks to John Ratcliff for writing this. He says it's small but it looks like a good chunk of work. Now I just need to dust off my C++ and get stuck in (must be 8 years since I wrote C++).
I just installed Eclipse CDT on my Ubuntu box, got it to work, then downloaded Subversion, and checked out the source tree from Google Code and I'm now able to build the source. I have some blockchain analysis I want to do... Actually another link from John's site is a good looking paper on some previous analysis here.
Also thanks to John for some links that explain the transaction malleability issue in bitcoin: transaction malleability explained. Seems like it's just a crude random corruption of the chain that gets corrected but SHOULDN'T cause any loss of coins. Have a read.

Monday, 3 March 2014

Blockchain secrets

Back on the bitcoin theme, I really wondered at the time what the point of the rather sophisticated cyber-attack on bitcoin in the early weeks of this year was. The way I understand it is that the hackers were exploiting the transaction malleability problem  (not that I fully understand what that is) to disrupt and confuse the blockchain (the definitive record of all transactions). It sounds like a very sophisticated hack - suggesting to me a team with a deep understanding of the inner workings of bitcoin. At one time the blockchain was seriously disrupted and forked into two different accounts of the transactions. But it survived and was repaired. Tough little cookie! But who would want to attack in this way? Technophobes (clever ones)? An intolerant state? The financial establishment? Or maybe coin-grabbing criminals? There's no word on the hackosphere about who was doing this as far as I know - which is interesting.
But what was the purpose of the attack? On the face of it it looks like theft - Shadowy trading site Silkroad 2 threw in the towel saying all their bitcoins had "gone" (pretty much). But now MtGox say much the same thing, citing a massive loss of coins. I wonder, was this cyber attack really grand-theft cyber, or was it a massive smokescreen to hide something really bad behind. The attack caused mass confusion and very nearly smashed the blockchain (the definitive immutable record of all bitcoin transactions that have ever occurred). This would have effectively killed bitcoin stone dead. I wonder why anyone would want to do that. On the face of it it makes little sense in theft-terms.

Got Them Bitcoin Blues

The Sunday Times supplement had a great article yesterday - Desperately Seeking Satoshi - focussed on the mystery of bitcoin's creator Satoshi Nakamoto. Satoshi the persona appeared on public blogs in 2009, posted the seminal bitcoin paper https://bitcoin.org/bitcoin.pdf (a masterly paper) then, having got the bitcoin community rolling, gradually faded from the blogosphere and finally disappeared in 2011, saying he had "moved on" to other interests. The article was a very nice piece of work and after much ado pointed out David Chaum as the most likely identity for Nakamoto.
Another article in the main Sunday Times discussed the woes of bitcoin - with a headline suggesting the currency was "crumbling" after the MtGox meltdown. However, it did concede that bitcoin prices were actually holding fairly firm around the $500 mark - In fact I just checked BTC-E and it looks like there's an upward trend - currently trading at $574.
One thing that I haven't heard mentioned is that the whole idea behind bitcoin was the removal of a need for a "trusted broker" in transactions - bitcoin passes direct, person to person with no need for a middleman. The aim was to take banks etc. out of the game because they CAN'T BE TRUSTED. People were supposed to hold their own coins. The thing is, sites like MtGox and the others are exactly what bitcoin was supposed to do away with. What's worse they are totally unregulated. Who knows what went on at MtGox - the various stories make minimal sense to me. I have my theories. But the blockchain could know some interesting stuff about the whole story - and the neat thing is that that isn't going to disappear (ever)!
So in a way, the MtGox calamity is a reminder about what Bitcoin is really about. I believe it will rise again - with a reshaped ecosystem... far greater than before! 

Sunday, 2 March 2014

The weather revisited

Looking back at my last post of 11 months ago I'm quite pleased (kind of) with my weather prediction. To me what we've seen this spring looked pretty much like the tipping point I predicted, into a new and more energetic weather pattern - the crazy storms we've had here at a rate of 2 a week, since before Christmas are quite unprecedented. If the hot/cold biannual cycle is continuing, this March might be super-hot, followed by cold again next year. But maybe we already flipped into an energetic stirred up mode. We'll see very soon.