To the editors:
Jean-Paul Delahaye explains clearly in his article how the bitcoin protocol works, as well as the enthusiasm and fear it elicits. The major innovation behind bitcoin is the creation of an ecosystem that allows for a decentralized consensus. Users no longer need a reliable central authority to interact with each other. While it is difficult to predict how blockchain technology will become important in our everyday activities over the next few years—increasing our happiness, or not—the enthusiasm it elicits is growing. It could have many applications, of which cryptocurrencies are only an initial example. Among the platforms arousing great interest is Ethereum, which allows users to manage and interact with smart-contracts. Ethereum promises applications that work exactly as they were programmed and cannot be interrupted, censored, or modified in any way. It has been spoken of as one of the building blocks of web 3.0!
Understanding precisely how all this works in practice is a fascinating subject. How can a user, interacting with the network via only his smartphone, quickly verify that a transaction has been saved on the blockchain? One approach uses the concept of a Merkle tree. This is an important data structure used in computing and cryptography, and I would like to follow up Delahaye’s article by explaining this point. Another legitimate question that one might ask with reference to the mechanism of transaction validation is: why do bitcoins have real economic value?
There is more to bitcoin than simply a currency of exchange. To illustrate this, I will examine in some detail a transaction inscribed in the bitcoin blockchain. This transaction has nothing to do with any financial exchange. On the contrary, it is connected to the idea of using the blockchain to write information permanently, in a form that is tamper-proof and visible to everyone. This way of “hacking” the bitcoin blockchain (that is, diverting it from its primary use, although this “hack” is fully authorized by the protocol) immediately opens up a world of possibilities. I will illustrate this using an example application for which a platform like Ethereum is perfectly adapted.
A Merkle Forest
Creating a peer-to-peer network allowing one to transfer “money” without resorting to a central authority like a bank, is, it must be said, a conceptual challenge. It is an even greater challenge when you keep in mind that, unlike a gold ingot, a computer file can be duplicated infinitely many times at almost zero cost. Yet Satoshi Nakamoto’s ideas, as described in his article, “Bitcoin: A Peer-to-Peer Electronic Cash System,” posted on the web in 2008, make this possible.1 Far from being extraordinarily complicated, his ideas are, on the contrary, a clever arrangement of simple building blocks. Some of these have been well known for around forty years and have been widely used in many cryptographic systems. It seems almost a miracle that a currency can be supported by a shared computer file and that it works. It is a miracle—and in the case of bitcoin, a currency—that depends entirely on mathematics.
In terms of cryptography, the bitcoin platform relies on two fundamental mathematical concepts: hash functions and electronic signatures. One might think of the hash of a computer file as its digital fingerprint. The initial computer file can be potentially very large (from a few kilo- to several giga-bytes, depending on whether it is text, image, music or video), while its fingerprint is only a few bytes. Calculating a hash is a very fast process, consumes few resources, and is an irreversible operation, since it is impossible, in practice, to reconstitute the original file from its fingerprint. Importantly, it is extremely unlikely that two distinct files will produce the same hash. The electronic signature of a message proves that the person claiming to be its author is indeed its author. The underlying mathematics ensures that it will not suffice to study in detail the messages signed by a person to be able to appropriate his signature and sign new messages without his knowledge.
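The fingerprint idea can be tried directly. The following is a minimal sketch using SHA-256, the hash function on which bitcoin relies; the helper name `fingerprint` and the sample inputs are illustrative assumptions, not part of any protocol.

```python
import hashlib

def fingerprint(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as 64 hexadecimal characters."""
    return hashlib.sha256(data).hexdigest()

small = b"charley loves heidi"
large = b"x" * 10_000_000  # about ten megabytes of input (illustrative)

# The fingerprint has the same fixed size whatever the size of the input.
print(len(fingerprint(small)), len(fingerprint(large)))

# Changing a single character gives an unrelated-looking fingerprint.
print(fingerprint(b"charley loves heidi"))
print(fingerprint(b"charley loves heidj"))
```

Electronic signatures are not shown here, since they require an elliptic-curve library outside the Python standard library.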
Bitcoin relies on the blockchain, a data structure that allows the data to be linked together. How? By organizing the data in blocks and indicating in the header of each block the hash of the preceding block. Modifying a datum in a block modifies its hash and is therefore reflected in all subsequent blocks. Each block organizes the transactions it contains according to a Merkle tree, a data structure first described by Ralph Merkle in 1979. Imagine that one wants to verify that some data belongs to a set of data that is known in advance. In the case of the bitcoin blockchain, one wants to know whether a particular transaction has been recorded in a given block. Compared to what might be described as the “naïve” method, i.e. recovering the block of transactions and comparing the known transaction with each of the transactions in the block, a Merkle tree answers this question in a way that reduces calculation and data transfer overheads.
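The chaining mechanism can be sketched in a few lines. This is a toy model under stated assumptions: blocks are plain dictionaries serialized with JSON, and the names `block_hash`, `build_chain`, and `is_consistent` are hypothetical. Real bitcoin blocks use a fixed binary header format.

```python
import hashlib
import json

def block_hash(block: dict) -> str:
    """Hash a block by hashing a canonical JSON serialization of it."""
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()

def build_chain(batches):
    """Link batches of data: each block stores the hash of its predecessor."""
    chain, prev = [], "0" * 64  # placeholder hash for the first block
    for data in batches:
        block = {"prev_hash": prev, "data": data}
        chain.append(block)
        prev = block_hash(block)
    return chain

def is_consistent(chain) -> bool:
    """Check that every block points at the actual hash of the block before it."""
    return all(chain[i]["prev_hash"] == block_hash(chain[i - 1])
               for i in range(1, len(chain)))

chain = build_chain([["a pays b"], ["b pays c"], ["c pays d"]])
print(is_consistent(chain))   # True
chain[0]["data"] = ["a pays z"]  # tamper with an old block
print(is_consistent(chain))   # False: the next block no longer matches
```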
A Merkle tree is a rooted binary tree; the data are attached to the leaves, the hashes of the corresponding data are attached to the parents of the leaves, and the hashes of the two child nodes are attached to all the other nodes. The root of the Merkle tree contains a fingerprint of the whole tree. If someone wants to verify that a transaction belongs to the block in question, and not simply trust a network node asserting that this is the case, he has to ensure that the block belongs to the blockchain and to recalculate what the root of the Merkle tree of the block should be. He can then compare it to the value indicated in the header of the block. One does not need to know all the transactions from the block; it is enough to request the hashes of the branch that goes from the leaf containing the particular transaction all the way to the root. This represents a very small amount of data when compared to that comprising all the transactions in the block. For the same reasons, if one changes a datum in a leaf, it is not necessary to recalculate the entire Merkle tree, but only the corresponding branch all the way to the root. One of the great strengths of this data structure is that it allows users to connect to the network via light clients on their smartphones and to verify the transactions of interest. Of course, nothing beats a full client, one that stores the entire blockchain and ensures the validity of all transactions without the help of any third party; these nodes can calculate the balance of a bitcoin address and trace precisely the origin of the bitcoins at the address in question.
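The branch verification described above can be sketched as follows. This is a simplified illustration, not bitcoin's exact encoding: it uses a single SHA-256 where bitcoin applies the hash twice, it stores only hashes (the leaf level holds the hashes of the transactions), and the function names are hypothetical. Duplicating the last hash when a level has an odd number of entries follows bitcoin's convention.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_levels(leaves):
    """Return every level of the tree, from the leaf hashes up to the root."""
    level = [h(x) for x in leaves]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2 == 1:          # odd count: duplicate the last hash
            level = level + [level[-1]]
            levels[-1] = level
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels

def merkle_proof(levels, index):
    """Collect the sibling hashes along the branch from leaf `index` to the root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1              # the neighbour sharing the same parent
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof

def verify(leaf, proof, root) -> bool:
    """Recompute the root from one leaf and its branch, then compare."""
    acc = h(leaf)
    for sibling, sibling_is_left in proof:
        acc = h(sibling + acc) if sibling_is_left else h(acc + sibling)
    return acc == root

txs = [f"tx-{i}".encode() for i in range(5)]
levels = merkle_levels(txs)
root = levels[-1][0]
proof = merkle_proof(levels, 4)
print(len(proof), verify(txs[4], proof, root))
```

For a block of n transactions the branch contains about log2(n) hashes, which is why a light client needs so little data.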
Why Do bitcoins Have Real Economic Value?
The simplest ideas are often the most brilliant. Nakamoto’s ideas are a prime example. The main problem in such a system is, of course, how to find a consensus about the validity of transactions without a central authority, and when the network’s nodes do not a priori trust one another.
Each block header contains the hash of the previous one, the root of the Merkle tree of the transactions contained in the block (and thus a fingerprint of all the transactions in the block!), a timestamp, and a nonce. Anyone can thus, in the blink of an eye, verify the validity of a block, while the miners have had to work hard to find the nonce in question.2 The winning miner is the first to provide a proof of work—i.e., the value of the nonce which makes it so that the hash of the header of the block in question is less than a certain value. This value is directly related to the computing capacity of the entire bitcoin network and periodically adjusted. Since the distribution of hashes is random, a miner has no choice other than to randomly take a nonce and see if it works... and, if it doesn’t work, to start again. This is where the system acquires real economic value, moving from mathematics and computer science to physics, if you will. The cost for a miner to validate a block is significant: the purchase and maintenance of computer equipment, and, above all, significant electricity consumption, and thus the degradation of energy into heat. A mined block gets its value from computing, in the same way that a gold ingot acquires value because it is a rare metal. If it were easy to mine a block, a bitcoin would have no more value than a liter of air! That one can exchange goods or services for the result of a calculation is, in contemporary terms, a totally disruptive idea.
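The search for a nonce can be illustrated with a toy miner. This sketch makes several simplifying assumptions: the header is an arbitrary byte string rather than bitcoin's real header layout, a single SHA-256 is used, nonces are tried in sequence rather than at random, and the target is chosen so that the search ends quickly. The names `mine` and `check` are hypothetical.

```python
import hashlib

def digest_value(header: bytes, nonce: int) -> int:
    """Interpret SHA-256(header + nonce) as a 256-bit integer."""
    d = hashlib.sha256(header + nonce.to_bytes(8, "big")).digest()
    return int.from_bytes(d, "big")

def mine(header: bytes, target: int) -> int:
    """Try nonces 0, 1, 2, ... until the hash falls below `target`."""
    nonce = 0
    while digest_value(header, nonce) >= target:
        nonce += 1
    return nonce

def check(header: bytes, nonce: int, target: int) -> bool:
    """Verification takes a single hash, however long mining took."""
    return digest_value(header, nonce) < target

target = 1 << (256 - 16)  # about one hash in 65,536 succeeds (illustrative)
header = b"prev_hash|merkle_root|timestamp"
nonce = mine(header, target)
print(nonce, check(header, nonce, target))
```

Lowering `target` makes the search proportionally longer while leaving the cost of `check` unchanged, which is the asymmetry the protocol relies on.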
For their work, miners are paid in accordance with the two basic rules of the bitcoin protocol: on the one hand, one may only transfer accrued bitcoins, and, on the other hand, only miners can create new bitcoins ex nihilo. But in a dispersed world, two valid blocks may be found at almost the same time at two ends of the network. The two blocks do not necessarily contain the same transactions, and may even contain contradictory transactions, for instance when a user of the network tries to spend the same bitcoins twice. Two blockchains then find themselves in competition, a situation we call a fork. But, a few minutes later, other blocks will be mined. A consensus is reached by keeping the blockchain with the highest “computing value.” Again, the central notion of “computing content” arises, one that gives a real economic value to the bitcoins, since consensus is established by giving credit to the miners who have worked the most.
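The rule of keeping the chain with the highest “computing value” can be sketched as follows. In this toy model, which is an assumption made for illustration, each competing branch is reduced to the list of targets of its blocks, and the work of a block is estimated as the expected number of hashes needed to fall below its target.

```python
def block_work(target: int) -> int:
    """Expected number of hashes needed to find a digest below `target`."""
    return (1 << 256) // (target + 1)

def chain_work(targets) -> int:
    """Total work of a branch, given the target of each of its blocks."""
    return sum(block_work(t) for t in targets)

def select_chain(candidates):
    """Keep the branch that represents the most cumulative work."""
    return max(candidates, key=chain_work)

easy = 1 << 240   # illustrative targets, not real network values
hard = 1 << 236   # sixteen times harder to satisfy than `easy`

branch_a = [easy, easy, easy]   # three easy blocks
branch_b = [easy, hard]         # fewer blocks, but more total work
print(select_chain([branch_a, branch_b]) is branch_b)  # True
```

The example shows that what counts is the accumulated work and not simply the number of blocks.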
Why do miners agree to play the game and why do they finally come to a consensus? If you are a miner, it means that you have purchased the necessary equipment and are willing to consume a significant amount of electricity. When you look at the bitcoin protocol in detail, you realize that there is no economic incentive to hack the network; everything was designed so that attacks, in the end, are far more costly than any benefits that could be derived from them. Miners thus have a vested interest in “behaving well,” that is, in complying with the protocol. This is how consensus emerges at the global level for validated transactions between actors who do not know each other and have no particular a priori trust in each other. Believe it or not, this has now been working for eight years!
There is, however, a downside to this. It is indeed possible to alter the proper functioning of the system if someone manages to capture at least 51% of the computing power of the whole network. This is unlikely, but not completely unimaginable, because of the concentration of miners into pools. There remain things a hacker would not be able to do. In particular, he would never be able to spend bitcoins that are not his. On the other hand, he would be able to spend his own bitcoins several times over, and to deny service by refusing to validate certain transactions. If this were to happen, nothing would prevent miners involved in the 51% from jumping ship if they did not find the situation to be in their interest. No miner—unless he wants to lose all the money he has invested from the beginning—has any real incentive to make the whole system collapse.
Indelible Ink
Since 2013, some users of the bitcoin platform have tried to write information unrelated to financial transactions into the blockchain. Why? Because once written on the blockchain, it will remain there ad vitam aeternam. It is not difficult to imagine some types of information, such as the terms of a contract, that we would like to always have available, from anywhere in the world, and maintained in a way that they cannot be tampered with. The bitcoin blockchain was not designed for this purpose, but with a little ingenuity, it is nevertheless possible to implement this type of functionality. This has pleased some people, because it demonstrates, if this is still necessary, the potential of the blockchain. It has not pleased others, who see this type of data as contaminating the protocol at two levels. On the one hand, it can greatly increase the size of the blockchain and, on the other hand, transaction outputs (UTXO) are created that can never be unlocked but will remain consigned forever to the group of all UTXOs. In 2014, a new instruction for the bitcoin script language appeared, OP_RETURN, offering a compromise. With this instruction, it became “legal” to deposit up to 40 bytes per transaction, and the problem of non-unlockable UTXOs was also resolved.
Many websites allow real-time visualization of the bitcoin blockchain.3 Consider the transaction whose hash is
8bae12b5f4c088d940733dcd1455efc6a3a69cf9340e17a981286d3778615684.
We can see that it was validated on June 30, 2014 but we can also see that the first output script is
OP_RETURN 636861726c6579206c6f766573206865696469.
Behind the instruction OP_RETURN hides a message written in hexadecimal. A conversion from hexadecimal to UTF-8 reveals the message
charley loves heidi.
A declaration of love inscribed forever in the bitcoin blockchain!
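The conversion can be reproduced in two lines of Python; the variable names are illustrative.

```python
payload_hex = "636861726c6579206c6f766573206865696469"

# Each pair of hexadecimal digits is one byte; the bytes are then read as UTF-8.
payload = bytes.fromhex(payload_hex)
print(payload.decode("utf-8"))  # charley loves heidi
print(len(payload))             # 19 bytes, well under the 40-byte limit
```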
Smart-Contract or PayPal?
Imagine three mathematician friends who would like to explain to a wide audience the fourth dimension, or the butterfly effect.5 After much labor, these friends complete a film, using synthesized images, about each subject. Since they wish to distribute their two films as widely as possible and for free, a platform like YouTube would suit them well.6 Suppose also that they decide to burn a small batch of DVDs to offer to teachers and students in countries where access to the internet is limited. They decide to sell a double DVD containing the two films at a relatively modest price, covering printing and shipping costs, but that will also guarantee that one out of every three DVDs is free. On their website, they create a page for orders.
To collect payments, they decide to use PayPal, an online payment service that allows users to pay for purchases, receive payments, and to send and receive money. PayPal takes a commission (not negligible, in fact) from each transaction and pays the remainder into the account belonging to the three friends. The system works well and our three friends are, on the whole, content.
All the same, they have two small regrets. The three friends give regular lectures, which are perfect opportunities to distribute their DVDs. This is a little awkward, in practice, because they have to collect money from buyers, either in cash or checks, and then deposit it in a bank. Not forgetting, of course, buyers who want to pay via a credit card... Additionally, the three mathematicians had hoped that, for every hundred DVDs sold, they could offer the next order free of charge. This type of gift is not easy to arrange via PayPal. While our friends have no intention of cheating, their sole aim being to make their DVDs available to the biggest audience possible, buyers have to trust them regarding the exact number of DVDs sold. Buyers have no way of verifying that their order was not the (n + 1)th purchase, n being a multiple of one hundred.
The three mathematicians released their films in 2008 and 2013, so they could not have known about the Ethereum platform, available only since 2015. Ethereum solves both these problems: payment without intermediaries and without constraint (offered, since 2009, by the bitcoin blockchain) and the gift that cannot be intentionally forgotten.
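The rule that such a contract could enforce is easy to state as a program. The following is a toy model written in Python rather than in a contract language, with a hypothetical class name, price, and log format; it only illustrates the logic, namely that the order counter and the gift rule are applied mechanically and recorded where everyone can check them.

```python
class DvdShop:
    """Toy model of a gift rule; not actual Ethereum contract code."""

    def __init__(self, price: int, free_every: int = 100):
        self.price = price
        self.free_every = free_every
        self.log = []  # stands in for the public record kept on the blockchain

    def order(self, buyer: str, payment: int) -> int:
        """Record an order and return the amount refunded to the buyer."""
        number = len(self.log) + 1
        # The (n + 1)th order is free whenever n is a positive multiple of 100.
        is_gift = number > 1 and (number - 1) % self.free_every == 0
        due = 0 if is_gift else self.price
        if payment < due:
            raise ValueError("insufficient payment")
        self.log.append((number, buyer, due))
        return payment - due

shop = DvdShop(price=10)
refunds = [shop.order(f"buyer-{i}", 10) for i in range(101)]
print(refunds[99], refunds[100])  # 0 10: the 101st order is refunded in full
```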
Ethereum: the New Generation
In a sense, Ethereum is a decentralized computer that runs on many nodes. This “world computer,” if one could describe it as such, is accessible from anywhere, no central authority controls access, and the operation of the programs it executes is guaranteed to be free from alteration or interruption. Presented in this way, it looks fantastic.7 And it is! There are, however, two caveats: this world computer is barely more powerful than a mobile phone of the 1990s, and it is necessary to “pay” to execute a program. In any blockchain-type platform, there is always a balance between economy and technology.
The philosophy behind a platform like Ethereum is that of a truly decentralized internet network. Today’s web, on the other hand, often offers us access to centralized services, whether in the form of a bank, a social network, a music platform, or a carpooling platform. In many cases, it is not unreasonable to limit the level of trust one places in these “central authorities,” which levy not insignificant fees for services rendered and which can generate immense profits from user data. If the server hosting the service suddenly becomes inaccessible, there is no recourse available to users. A list of grievances against the platforms we use on a daily basis could be extended, but that is not the point here. Keeping in mind that the internet is nothing but an immense network of public and private subnetworks, one can appreciate why the governing role played by certain nodes is not to the liking of the most liberal minds.8
As I have tried to illustrate, using the example of the DVD project by our three mathematician friends, the boundaries of Ethereum are quite different from those of Bitcoin. That said, both platforms allow users to record financial transactions, in ethers and bitcoins, respectively. But, of the two, the ether is oriented more towards paying for the operation of the system. In other words, users pay using ethers for smart-contracts they want to run. The Ethereum blockchain is structured around the idea of accounts and, in a way, is simpler than the bitcoin protocol, because one has direct access to an account’s balance. This point is an important difference between the two blockchains. In the Ethereum model it is not necessary to recalculate the entire transaction tree to know the balance of an account—all the more interesting for light clients who, as discussed previously, do not keep a copy of the entire blockchain, only the headers.
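The difference between the two bookkeeping models can be sketched as follows. The data layout is a deliberately simplified assumption: transactions are dictionaries with hypothetical `id`, `inputs`, and `outputs` fields, and the account model is reduced to a dictionary of balances.

```python
def utxo_balance(transactions, address: str) -> int:
    """Bitcoin-style: replay every transaction to find the unspent outputs."""
    unspent = {}
    for tx in transactions:
        for ref in tx["inputs"]:             # outputs consumed by this transaction
            unspent.pop(ref, None)
        for i, (owner, amount) in enumerate(tx["outputs"]):
            unspent[(tx["id"], i)] = (owner, amount)
    return sum(amount for owner, amount in unspent.values() if owner == address)

history = [
    {"id": "t0", "inputs": [], "outputs": [("alice", 10)]},
    {"id": "t1", "inputs": [("t0", 0)], "outputs": [("bob", 7), ("alice", 3)]},
]
print(utxo_balance(history, "alice"), utxo_balance(history, "bob"))  # 3 7

# Ethereum-style: the current state stores each balance directly.
accounts = {"alice": 3, "bob": 7}
print(accounts["alice"])  # 3, read with a single lookup
```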
The Ethereum blockchain is much faster than that of Bitcoin. The delay between two blocks in the Ethereum system is around 12 seconds, compared with roughly ten minutes in the bitcoin system. With so short a delay, the propagation time of a block through the network, understandably, poses de facto new challenges. The Ethereum protocol provides solutions in both cases. Moreover, and this is the great innovation of this platform, one can arbitrarily store data on the blockchain (by which I mean smart-contracts) that are, in fact, programs written in a Turing-complete language. There is thus no restriction on the complexity of programs that can be deposited on this particular blockchain.
Smart-contracts are programs stored and executed in the Ethereum blockchain. A financial transaction, as is the case in the bitcoin platform, is an elementary example of a smart-contract. If the author of a smart-contract has provided a user interface, or front-end, the smart-contract can be seen as the back-end of a decentralized application. A mined transaction is ultimately a public record that a certain smart-contract has been executed with specific inputs and that it has produced specific outputs. This can be verified by any network node. Everything is public on the blockchain and a smart-contract is filed for eternity.
It is important to understand that a smart-contract is a simple program stored in the blockchain, but that this program is also able to modify the state of the blockchain. When one wants to interact with a smart-contract function, one sends a transaction request to the network indicating the address of the smart-contract as well as the data required to perform the function in question. Ethers are included as payment for the transaction. I stress the word “one” in the previous sentence because it can designate a user as well as another smart-contract. In other words, nothing prevents smart-contracts from interacting with one another. Imagine, for example, an active investment contract which is, on the one hand, a securitization contract that itself recovers the repayment in a loan contract, and on the other hand, a rating contract charged with recovering information about repayment in a list of loans, etc. When Ethereum miners receive the transaction, they retrieve the corresponding smart-contract, execute it, and thus move the blockchain from one state to the next.
What are the constraints or limitations of Ethereum? Firstly, the execution of a smart-contract is necessarily initiated “from the outside.” In other words, a smart-contract cannot decide, autonomously, to be executed. It is not possible to program a “scalper” smart-contract that would spend its time monitoring a share price, to buy or sell as the case may be. Nor can a smart-contract use a web service, that is to say, external data provided by a web service. There are at least two reasons for this: on the one hand, it would be a problem if the data service were unavailable at the time the smart-contract was executed and, on the other hand, all the nodes of the network must rigorously perform the same task, which requires them all to have access to the same program inputs. At the heart of the protocol is the principle that “the executer pays.” Payment must be made, in ether, of course, for the execution of a program, and invoicing (which is established in gas, the execution fee) is directly related to execution time and the size of the inputs. It automatically limits the capabilities of any pranksters who might try to “saturate” the platform by asking it to execute anything and everything... A final point: nothing prevents the use of the Ethereum blockchain for data storage, as a kind of decentralized Dropbox, since all the nodes of the network store the data in question. But, as you might have gathered by now, payment would be required. And it would be extremely expensive... Thus the best way to immortalize the piece of music you just composed, or the novel you just finished, is not to store the entire work, only its hash.
The Adventure Begins
In this vision of a decentralized web, Ethereum represents an important first step, and one that has aroused a great deal of enthusiasm.
In the near future, we should expect to use, alongside our traditional web browsers, a browser such as Mist for managing a connection to Ethereum nodes, managing one’s identity, making payments, activating smart-contracts, and so on. The back-end of the application will hold the corresponding smart-contract and, eventually, one imagines that both front-end and back-end will be found on the blockchain.9 All this is still just a dream, but every day new developers join an enthusiastic and active community. We can already see how this technology might significantly impact whole sectors of the economy, from music to health to the “internet of things.” Not to mention even bigger upheavals, promised by some, in all kinds of organizations, including at the state level. Science and technology may now be placing in our hands the tools to invent new forms of democracy.
Aurélien Alvarez
Jean-Paul Delahaye replies:
My thanks to Aurélien Alvarez for this letter, which rounds out my article by focusing on a series of points which are all very interesting. There are, of course, many other aspects that deserve to be described in such detail about the revolution started by Satoshi Nakamoto... but then it would no longer be an introductory article, but a book!
Translated from the French by the editors.