Saturday, 12 April 2014

Tracing stolen bitcoins wherever they go

I have been working on an application to trace "lost or stolen" bitcoin. I now have a C++ app that reads the entire blockchain and traces transactions from a specified address, starting from a specified date (e.g. the date of the theft). What's powerful about it is that no matter how long and complex the trails are, the program can follow them however far they go and produce a "connected graph" output you can then visualize with a suitable tool. I'll probably start with Gephi. Although it's certainly not perfect for this task, it'll do something with the data and give me a visualisation. I'm hoping to use it on MtGox tonight and should be able to produce a complete trail that shows where all the bitcoin they "demonstrated control of" back in late 2011 ended up. I'm following up this post by a guy who tried to do this manually but gave up because of the complexity of the way the coins were subsequently split into smaller and smaller amounts across more and more addresses. Hopefully this app will do the whole job. If anyone's interested in following up other coin disappearances, let me know and I can run my app (maybe for a small coin donation!).
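To make the idea concrete, here is a minimal sketch of this kind of forward trace. It is not the app's actual code: it assumes the transactions have already been parsed into memory, and every type and function name in it (`Tx`, `TxIn`, `TxOut`, `Edge`, `traceFrom`, `toCsv`) is made up for illustration. Starting from every output paid to the chosen address, it follows each spend made on or after the chosen date and records one graph edge per output of the spending transaction. The CSV header assumes Gephi's spreadsheet import picks up `Source` and `Target` columns for an edge list.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <queue>
#include <set>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// All types and names below are hypothetical, for illustration only.
struct TxOut { std::string address; std::int64_t satoshis; };
struct TxIn  { std::string prevTx; std::size_t prevIndex; };  // the output being spent
struct Tx {
    std::string id;
    std::int64_t time;          // e.g. Unix timestamp of the containing block
    std::vector<TxIn> inputs;   // empty for a coinbase transaction
    std::vector<TxOut> outputs;
};
struct Edge { std::string from, to, tx; std::int64_t satoshis; };
using OutPoint = std::pair<std::string, std::size_t>;  // (transaction id, output index)

// Breadth-first walk over spends, seeded with every output paid to `start`.
// Spends dated before `since` are ignored.
std::vector<Edge> traceFrom(const std::vector<Tx>& chain,
                            const std::string& start, std::int64_t since) {
    std::map<std::string, const Tx*> byId;   // transaction id -> transaction
    std::map<OutPoint, const Tx*> spentBy;   // output -> transaction that spends it
    std::queue<OutPoint> work;
    std::set<OutPoint> seen;
    for (const Tx& tx : chain) {
        byId[tx.id] = &tx;
        for (const TxIn& in : tx.inputs)
            spentBy[OutPoint{in.prevTx, in.prevIndex}] = &tx;
        for (std::size_t i = 0; i < tx.outputs.size(); ++i) {
            const OutPoint op{tx.id, i};
            if (tx.outputs[i].address == start && seen.insert(op).second)
                work.push(op);
        }
    }

    std::set<std::pair<std::string, std::string>> emitted;  // (from address, spending tx)
    std::vector<Edge> edges;
    while (!work.empty()) {
        const OutPoint op = work.front();
        work.pop();
        const auto spend = spentBy.find(op);
        if (spend == spentBy.end()) continue;   // unspent: this trail ends here
        const Tx& spender = *spend->second;
        if (spender.time < since) continue;     // moved before the date of interest
        const Tx& source = *byId.at(op.first);
        assert(op.second < source.outputs.size());
        const std::string& from = source.outputs[op.second].address;
        // Emit edges once per (address, transaction), even if that address
        // funds several inputs of the same transaction.
        const bool firstVisit = emitted.insert(std::make_pair(from, spender.id)).second;
        for (std::size_t i = 0; i < spender.outputs.size(); ++i) {
            const TxOut& out = spender.outputs[i];
            if (firstVisit)
                edges.push_back(Edge{from, out.address, spender.id, out.satoshis});
            const OutPoint next{spender.id, i};
            if (seen.insert(next).second) work.push(next);  // keep following the value
        }
    }
    return edges;
}

// Edge list as CSV text, one row per edge.
std::string toCsv(const std::vector<Edge>& edges) {
    std::ostringstream csv;
    csv << "Source,Target,Label,Weight\n";
    for (const Edge& e : edges)
        csv << e.from << ',' << e.to << ',' << e.tx << ',' << e.satoshis << '\n';
    return csv.str();
}
```

Note that "change" sent back to the sending address shows up as a self-loop edge in this sketch, and that every output of a spending transaction is followed, whether or not all of its inputs came from the traced trail. A real tracer would also have to cope with outputs that have no single address.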

I'm interested to know if anyone's tried this, and also whether there are ways people can prevent this kind of trace. I gather a "splitter" or "tumbler" could be used to try to prevent it, but I'm not sure how they work... Any thoughts gratefully received.

In writing the code for this I finally (I think!) grasped the way the blockchain and transactions are structured. I did struggle quite a bit to get my head around it, and earlier versions of the code produced traces with anomalies caused by transaction "corner cases". I learned the hard way that bitcoin transactions are more complex and a lot more powerful than I imagined. The big issue was the fact that each transaction can have multiple inputs (i.e. lumps of bitcoin value previously assigned into the sending wallet) and multiple outputs (i.e. lumps of bitcoin value assigned out from the sending wallet to other wallets, plus any "change" returned to the sending wallet). This means that a perfect trace of a given coin is impossible, as the transaction effectively creates new output bitcoin that is a mix of all the inputs (which were themselves outputs of earlier transactions). It is both beautiful and complex! Effectively my trace should be able to tell you "of the original transaction from address a, value x, y% of the value ended up at address b, having been diluted through mixing with other transactions by z%".
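One simple way to put numbers on that mixing is a proportional rule: treat every output of a transaction as carrying the same traced fraction as the combined inputs. The sketch below assumes that rule, which is only one of several possible conventions and not necessarily what my app ends up doing. The names `Lump`, `taintFraction` and `mix` are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical types for illustration only.
struct Lump {
    std::int64_t satoshis;  // total value of this input or output
    double tainted;         // part of that value (in satoshis) traced back to the source
};

// Fraction of the combined input value that is traced value.
double taintFraction(const std::vector<Lump>& inputs) {
    double total = 0.0;
    double tainted = 0.0;
    for (const Lump& in : inputs) {
        assert(in.satoshis >= 0 && in.tainted >= 0.0 &&
               in.tainted <= static_cast<double>(in.satoshis));
        total += static_cast<double>(in.satoshis);
        tainted += in.tainted;
    }
    return total > 0.0 ? tainted / total : 0.0;
}

// Proportional rule (an assumption): each output inherits the inputs' traced
// fraction. Outputs may sum to less than the inputs; the gap is the miner fee.
std::vector<Lump> mix(const std::vector<Lump>& inputs,
                      const std::vector<std::int64_t>& outputValues) {
    const double fraction = taintFraction(inputs);
    std::vector<Lump> outputs;
    for (std::int64_t value : outputValues)
        outputs.push_back(Lump{value, static_cast<double>(value) * fraction});
    return outputs;
}
```

For example, a transaction that combines a fully traced 3 BTC input with an unrelated 1 BTC input has a traced fraction of 0.75, so under this rule each output counts as 75% traced value, i.e. diluted by 25%. Applying `mix` hop by hop along a trail gives the y% and z% figures for any address the trail reaches.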

Just a quick question. My understanding, having studied a bit and done this coding, is that contrary to what many people think, bitcoins don't have their own unique identity... There is just bitcoin value that moves between addresses via transactions, mixing in with other bitcoin value as it goes? It would be good to hear others' thoughts on this as I want to make sure I've got it right. I heard a quote that the first ten times you think you understand bitcoin, you don't understand bitcoin. I think I am somewhere around the 4-5 times mark at the mo.

BTW I found Petri Net notation (or something very much based on it) useful for analyzing the different bitcoin transaction types my code needs to understand.
