Chapter 2. Understanding the blockchain

This chapter covers

  • Low-level details of an Ethereum node
  • The technology stack you use to build Ethereum Dapps
  • Technologies underlying the Ethereum blockchain
  • Ethereum’s history and governance

I intended chapter 1 to give you a high-level overview of decentralized applications without overwhelming you with too many details. Consequently, I’m sure you’re still wondering what technical stack you need to learn to build a full Dapp. Also, you might feel the architectural presentation on Dapps didn’t go as far as you’d have liked, and you might still have doubts about how a blockchain exactly works. If you’re asking yourself these questions, I’ll address them in this chapter.

I’ll start by revisiting the voting Dapp I introduced in the previous chapter, and I’ll cover some aspects of an Ethereum node I skipped earlier for simplicity. I’ll then cover the entire technology stack required to implement a full end-to-end decentralized application. Additionally, I’ll introduce the cryptographic concepts and foundations you need to acquire to appreciate how a blockchain works. Before closing the chapter, I’ll present technologies specific to the Ethereum blockchain and give you some information on Ethereum’s history and governance.

2.1. A deeper look at decentralized applications

When I presented the structural and transactional views of a decentralized application in chapter 1, I decided to keep them at a relatively high level. I’m aware blockchain technology might be completely new to you, so I wanted to make sure you understood the high-level architecture and the purpose of decentralized applications without confusing you with too much jargon and too many technologies. Now that you’ve acquired a solid foundation, it’s time to have a deeper look at Ethereum Dapps. Let’s start by stepping into an Ethereum node.

2.1.1. Inside an Ethereum node

As shown in figure 2.1, each node of the Ethereum P2P network contains two main components:

  • An Ethereum client—This acts as a runtime and contains four elements:

    • A virtual machine called Ethereum Virtual Machine (EVM), capable of executing smart contract code generally written in a language called Solidity and compiled into EVM bytecode.
      Figure 2.1. An Ethereum node includes an Ethereum client and a blockchain database. The client contains a client process, an Ethereum Virtual Machine, a memory pool, and a JSON-RPC API exposing the functionality of the node externally. There are two types of nodes: full nodes and mining nodes.

    • A memory pool, where the node stores transactions that it receives, such as a vote submitted by a voter from the client side, before it propagates them further into the network.
    • A client process, which coordinates the processing. It handles incoming messages and transactions, dispatches them to the EVM when appropriate, and stores transactions to, and retrieves them from, the memory pool. The client process also handles incoming blockchain blocks that the node receives from peer nodes and appends them to the local copy of the blockchain database.
    • A JSON-RPC API, which exposes the functionality of the client to other nodes and external users.
  • A blockchain database—Apart from transaction data, such as votes submitted by voters, the blockchain also keeps a copy of the EVM bytecode of all smart contracts deployed on the network and holds their state. Mining nodes append new blocks to the blockchain regularly, roughly every 15 seconds.
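To make the JSON-RPC API a little more concrete, here’s a minimal sketch in Python of how a request to a node could be assembled and how a reply could be decoded. The method name eth_blockNumber belongs to Ethereum’s standard JSON-RPC interface; the helper function names and the sample response are assumptions of mine for illustration, and nothing is sent over the network.

```python
import json

def build_rpc_request(method, params=None, request_id=1):
    """Assemble a JSON-RPC 2.0 request body as a JSON string."""
    return json.dumps({
        "jsonrpc": "2.0",
        "method": method,
        "params": params or [],
        "id": request_id,
    })

def parse_hex_quantity(response_text):
    """Decode the hex-encoded number carried in a response's 'result' field."""
    return int(json.loads(response_text)["result"], 16)

# Request for the number of the latest block (built only, not sent anywhere).
request_body = build_rpc_request("eth_blockNumber")

# Hypothetical response, shaped like the reply a node might return.
sample_response = '{"jsonrpc": "2.0", "id": 1, "result": "0x4b7"}'
latest_block = parse_hex_quantity(sample_response)
```

In a real setup, you’d post request_body over HTTP to the node’s JSON-RPC endpoint; I’ve left that step out so the sketch doesn’t depend on a running node.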

2.1.2. Revisiting the lifecycle of a transaction

Now that you know an Ethereum node hosts a JSON-RPC interface, an EVM, and a memory pool, I can explain to you, with the help of some diagrams (figures 2.2 to 2.4), what role they play during the transaction lifecycle.

Figure 2.2. The lifecycle of a transaction. A voting transaction is created when a function is invoked on a smart contract on a chosen Ethereum node through the JSON-RPC interface. The node places the transaction in the memory pool and executes it on the EVM for validation. If the validation is successful, the transaction is broadcast to peer nodes until it reaches a mining node; otherwise, it dies out.

A transaction is generated when a function is invoked on a smart contract of the chosen Ethereum node through the JSON-RPC interface. (See figure 2.2.)

  1. A full node receives the transaction from a peer node and places it in the memory pool. (See figure 2.2.)
  2. The full node executes the transaction on the EVM for validation. (See figure 2.2.)
  3. If the validation is successful, the node broadcasts the transaction to its peer nodes. If the validation is unsuccessful, the node doesn’t propagate the transaction further, and it dies out.
  4. A mining node places the transaction received from a peer node in the memory pool. (See figure 2.3.)
  5. The mining node picks transactions deemed to be profitable from the memory pool, executes them on the EVM, and tries to add them onto a new block. (See figure 2.3.)
  6. If the block created is added successfully to the blockchain, the mining node removes the related transactions from the memory pool. (See figure 2.3.)
    Figure 2.3. A mining node receives the transaction from a peer node and places it in its memory pool. The node later picks it and executes it on the EVM, among other transactions, to place it on a new block. If the block is appended on the blockchain, the transaction is removed from the memory pool and the block is broadcast to peer nodes.

  7. The node broadcasts the new block to peer nodes. (See figure 2.3.)
  8. A full node receives the new block from a peer node. (See figure 2.4.)
  9. The full node executes all the block transactions on the EVM for validation. (See figure 2.4.)
  10. The node removes all the associated transactions from its memory pool if the block has been validated successfully. (See figure 2.4.)
  11. The node broadcasts the block to peer nodes. (See figure 2.4.)
Figure 2.4. The full node’s process, from when it receives the new block to when it processes all its transactions on the EVM for validation, then, if validation is successful, removes the related transactions from the memory pool and propagates the block further into the network
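The steps above can be sketched as a toy simulation. This is a heavily simplified model under my own assumptions: the ToyNode class, its method names, and the validity rule are all hypothetical, and a simple field check stands in for executing the transaction on the EVM.

```python
class ToyNode:
    """Hypothetical, simplified node: a memory pool, a chain copy, and peers."""

    def __init__(self, name):
        self.name = name
        self.mempool = {}   # transaction id -> transaction
        self.blocks = []    # local copy of the blockchain
        self.peers = []

    def is_valid(self, tx):
        # Stand-in for validating the transaction on the EVM.
        return bool(tx.get("sender")) and bool(tx.get("candidate"))

    def receive_transaction(self, tx):
        # Transactions already pooled are ignored; invalid ones die out here.
        if tx["id"] in self.mempool or not self.is_valid(tx):
            return
        self.mempool[tx["id"]] = tx
        for peer in self.peers:
            peer.receive_transaction(tx)

    def mine_block(self):
        # A mining node packs its pooled transactions into a new block.
        block = {"number": len(self.blocks),
                 "transactions": list(self.mempool.values())}
        self.receive_block(block)
        return block

    def receive_block(self, block):
        # Skip blocks already stored; reject blocks with invalid transactions.
        if block in self.blocks:
            return
        if not all(self.is_valid(tx) for tx in block["transactions"]):
            return
        self.blocks.append(block)
        for tx in block["transactions"]:
            self.mempool.pop(tx["id"], None)
        for peer in self.peers:
            peer.receive_block(block)

full_node, mining_node = ToyNode("full"), ToyNode("miner")
full_node.peers.append(mining_node)
mining_node.peers.append(full_node)

vote = {"id": "tx1", "sender": "voter-1", "candidate": "Alice"}
full_node.receive_transaction(vote)            # validated, pooled, broadcast
pooled_on_miner = "tx1" in mining_node.mempool
new_block = mining_node.mine_block()           # block built and broadcast
```

After mine_block runs, both memory pools are empty and both nodes hold the same block, which mirrors steps 6 to 11. Real nodes also handle competing blocks and transaction fees, which this sketch ignores.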

2.1.3. Development view: Deploying the voting smart contract

By now, you should have a good idea of both what a decentralized application looks like and how a transaction flows throughout the system. You might still be wondering, though, when and how a smart contract gets propagated throughout the network. It turns out that the server-side contract propagation process is similar to that of a standard transaction, such as the voting transaction analyzed in the previous chapter in figure 1.8.

An Ethereum smart contract, such as the voting smart contract of the voting Dapp, is code written in the Solidity language. A smart contract developer compiles the code into EVM bytecode and then deploys it across the P2P network through a contract deployment transaction, which executes on a local Ethereum node and then propagates throughout the network. During its propagation throughout the network, a mining node processes the deployment transaction and stores its EVM bytecode on the blockchain, as illustrated in figure 2.5.

Figure 2.5. A developer writes the voting smart contract in the Solidity language, then compiles it into EVM bytecode and inserts it into a contract deployment transaction. This is pushed to the local Ethereum node and propagated throughout the network. It’s then mined and appended to the blockchain.

You might have noticed, while going through the static, dynamic, and development views of a Dapp in these two initial chapters, that I’ve mentioned languages and JavaScript libraries that might be unfamiliar to you. You shouldn’t be particularly worried about the amount of technology you’ll have to learn. You can implement a Dapp based on the Ethereum blockchain with languages much like those used in centralized apps you’re already familiar with. The client side of a Dapp is generally based on standard HTML5 + JavaScript; the communication layer between the UI and the server side is based on a JavaScript library called Web3 that’s executed on the client side; and you can implement server-side smart contracts in Solidity, a statically typed language whose syntax resembles JavaScript.

Your journey through this book will continue, as shown in figure 2.6, from the server side, which is the core of decentralized applications, and you’ll write smart contracts in Solidity. Then you’ll learn how to interact with a smart contract remotely through the Web3.js JavaScript library. Finally, you’ll implement a web-based UI, built on HTML and JavaScript.

Figure 2.6. You’ll progress from writing smart contracts in Solidity, to interacting with smart contracts remotely through the Web3.js JavaScript library, to building a web UI in HTML and JavaScript.

In summary, with some knowledge of JavaScript, or any C-like language, it isn’t difficult to transition from centralized to decentralized application development. But during that transition, it’s important to fully understand the technologies underlying decentralized applications, because they’re rather different from the technologies that centralized applications are built on. We’ll explore that in the next section.

2.2. What technologies make Dapps viable?

As you know, a Dapp is based on business logic encapsulated into smart contracts that are executed against a distributed database called a blockchain. Blockchain technology is based, in turn, on public key cryptography, cryptographic hash functions, and the concept of consensus, which you can implement using proof of work and proof of stake algorithms, among other ways.

You might be feeling like I keep opening more and more Russian dolls, and this might never end, but please don’t get frustrated! Cryptography is the lowest level I’m going to cover, I promise.

2.2.1. Blockchain technologies

In the next several sections, I’ll explain briefly all the cryptographic terms I’ve just mentioned so you can form a mental model of how a blockchain database works before we proceed further. Public key cryptography is the lowest technological block underlying the blockchain, so let’s start from there.

Public key cryptography

Public key cryptography is an encryption methodology based on a pair of keys: a private key, usually generated randomly, which is known only to its owner, and a public key, known to everyone, generated from an algorithm that takes the private key as an input. Figure 2.7 illustrates how private and public keys are generated.

Figure 2.7. A private key is generated with a random number generator. It’s then fed to an algorithm that generates a public key.

To better visualize it, think of the private key as the physical key of your mailbox (only you have a copy of it) and the public key as your postal address (everyone knows it), as shown in figure 2.8.

Figure 2.8. To understand the purpose of private and public keys, you can think of the public key as your postal address, known by everybody, and the private key as the key to your mailbox, only owned by you.
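Here’s a minimal sketch of the key generation flow: a random private key goes in, and a public key comes out of a one-way calculation. Note the assumptions: Ethereum derives public keys through elliptic-curve multiplication, whereas this toy version swaps in plain modular exponentiation, and the PRIME and GENERATOR parameters are illustrative values I picked, not ones used by any real blockchain.

```python
import secrets

# Toy public parameters (assumptions for illustration only).
PRIME = 2**127 - 1    # a known Mersenne prime used as the modulus
GENERATOR = 3

def generate_private_key():
    """Pick a random private key in the range [2, PRIME - 2]."""
    return secrets.randbelow(PRIME - 3) + 2

def derive_public_key(private_key):
    """Derive the public key from the private key by modular exponentiation."""
    return pow(GENERATOR, private_key, PRIME)

private_key = generate_private_key()
public_key = derive_public_key(private_key)
```

Going from the private key to the public key takes a single pow call, while going back means solving a discrete logarithm, which is the asymmetry the mailbox analogy relies on. Parameters this simple wouldn’t be safe for real use.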

The private key has two main purposes, as illustrated in figure 2.9:

  • It allows the decryption of data that has been encrypted using the public key.
  • It allows someone to digitally sign a document. They can produce the signature only if they know the private key, but anyone who knows the public key can verify the signature. As you’ll see, the authenticity of smart contract transactions relies on digital signatures.
Figure 2.9. You can use a private key to decrypt a document that has been encrypted with the related public key, as you can see in the top diagram. As shown in the bottom diagram, a private key also allows someone to sign a document digitally to prove provenance. The generated digital signature can then be verified against the document and the related public key.

In the context of a blockchain platform, cryptocurrency is generally stored against an account that is identified by a public key but can be operated only if you know the private key. If the private key is forgotten or lost, no one can use the account anymore, and its funds are considered lost.

Cryptographic hash functions

A hash function is any function that can map data of arbitrary size to data of fixed size. The fixed-size data is called a hash or digest. To give an example, you can design a hash function so that it always generates a 64-bit hash from a file or string of any size. Whether the input’s size is 10 KB or 10 GB, a 64-bit hash will be generated, as illustrated in figure 2.10.

Figure 2.10. A hash function produces a hash of a fixed size (64 bits in this example) given an input of any size.
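You can try this out with Python’s standard hashlib module. The sketch here uses the BLAKE2b function with an 8-byte digest to obtain a 64-bit hash, as in the example; the choice of BLAKE2b and the hash64 helper name are mine, purely for illustration.

```python
import hashlib

def hash64(data: bytes) -> str:
    """Return a 64-bit (8-byte) digest of the input as 16 hex characters."""
    return hashlib.blake2b(data, digest_size=8).hexdigest()

short_input = b"vote: Alice"
large_input = b"x" * 10_000_000    # roughly 10 MB of data

short_hash = hash64(short_input)
large_hash = hash64(large_input)

# Both digests have the same length, regardless of the input size.
same_length = len(short_hash) == len(large_hash) == 16
```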

A cryptographic hash function is a hash function that has five additional properties:

  • It’s deterministic. The same input will always generate the same hash.
  • It’s quick to compute.
  • It’s a one-way function, unfeasible to invert. This means that the only way to deduce the original data from its hash would be to try, by brute force, to obtain the same hash by applying the function to an enormous number of input data sets.
  • It should be almost impossible to obtain the same hash from two different sets of input data. Although a small chance exists that two inputs might produce the same hash, it’s impossible to determine them a priori, without applying the function to an enormous number of inputs, as suggested in the previous point.
  • A slight change in the input data should produce substantially different hash values. Consequently, also because of what I said in the previous point, unless you’re applying the cryptographic hash function to the same input, you won’t be able to intentionally get the same hash or even a close one.
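Two of these properties are easy to observe directly. The following sketch, which assumes SHA-256 as the cryptographic hash function and uses helper names of my own, hashes the same text twice and then a text that differs by one character, and counts how many of the 256 output bits differ.

```python
import hashlib

def sha256_int(text: str) -> int:
    """Hash the text with SHA-256 and return the digest as an integer."""
    return int.from_bytes(hashlib.sha256(text.encode("utf-8")).digest(), "big")

def differing_bits(a: str, b: str) -> int:
    """Count the bit positions in which the two SHA-256 digests differ."""
    return bin(sha256_int(a) ^ sha256_int(b)).count("1")

# Deterministic: the same input always gives the same hash (0 bits differ).
same = differing_bits("Pay $30 to the bookstore", "Pay $30 to the bookstore")

# Slight change in the input: a large share of the output bits flips.
changed = differing_bits("Pay $30 to the bookstore", "Pay $31 to the bookstore")
```

With a well-behaved hash function, you’d expect changed to land somewhere near half of the 256 bits.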

Given these properties, think about the following scenario. Imagine you’re writing a check for $30 to pay for the latest blockchain book from your local bookstore. I know, checks are almost no longer used and, if you’re a young reader, you might never have seen one! Please bear with me for a moment.

You filled out and signed the check, and you’re on your way to the bookstore, when, while on your mobile phone chat app, you trip over a curb. You don’t realize the check falls on the road and a gust of wind takes it away. You’re so unlucky that it ends up in the hands of Jack Forger, a local petty criminal. He knows how to remove ink, and he quickly replaces the amount and recipient as shown in figure 2.11.

Figure 2.11. A physical check forged by reusing the original signature and altering the recipient and amount

Jack then goes to a bank and successfully cashes the check for $30,000. The criminal had your handwritten signature and was able to replace the name of the recipient and the amount. Let’s see how a digital signature on an electronic check would avoid this unpleasant situation.

The digital signature on an electronic check would be a cryptographic hash produced using as the input the details of your check, the amount you’re paying, and the recipient, together with a private key associated with your bank account (the equivalent of your handwritten signature), as illustrated in figure 2.12.

Figure 2.12. An e-check can be secured with a digital signature generated with the private key associated with the sending bank account and the details of the check. It can be verified by checking the digital signature against the public key associated with the sender bank account and the check details.

When someone presents this kind of electronic check to a bank, together with the public key associated with your bank account, the bank can verify that the digital signature matches the details of the check (amount and recipient) and has been produced using your private key. That’s how the bookstore owner will be able to cash your check.

Because the digital signature is a cryptographic hash, that exact signature can only be produced from the specific details you used when filling out the electronic check. If someone, say a group of skilled hackers, tried to hijack the electronic check, changing the amount and, most importantly, the recipient would be pointless for two reasons:

  1. A new amount or recipient would generate a completely different digital signature, so the bank wouldn’t recognize the current one as valid, as shown in figure 2.13.
  2. If the hackers attempted to generate a new digital signature with new check details, they couldn’t generate one that could be associated with the public key of your bank account because they don’t know your private key.
Figure 2.13. An attempt at forging an e-check secured by a digital signature is unsuccessful because the original digital signature doesn’t match the altered check details.
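The e-check scenario can be sketched in a few lines. Be aware of the assumptions: this uses textbook RSA with a toy key pair built from two known Mersenne primes, not the ECDSA scheme that blockchains typically use, and the check format and function names are made up for illustration. It requires Python 3.8 or later for the modular inverse.

```python
import hashlib

# Toy RSA key pair (illustration only; unsuitable for real use).
P, Q = 2**61 - 1, 2**89 - 1
MODULUS = P * Q
PUBLIC_EXPONENT = 65537
PRIVATE_EXPONENT = pow(PUBLIC_EXPONENT, -1, (P - 1) * (Q - 1))

def digest(check_details: str) -> int:
    """Hash the check details and reduce the result into the key's range."""
    raw = hashlib.sha256(check_details.encode("utf-8")).digest()
    return int.from_bytes(raw, "big") % MODULUS

def sign(check_details: str, private_exponent: int) -> int:
    """Produce a signature; only the private key holder can do this."""
    return pow(digest(check_details), private_exponent, MODULUS)

def verify(check_details: str, signature: int, public_exponent: int) -> bool:
    """Check the signature using only public information."""
    return pow(signature, public_exponent, MODULUS) == digest(check_details)

original = "recipient=bookstore;amount=30"
forged = "recipient=Jack Forger;amount=30000"

signature = sign(original, PRIVATE_EXPONENT)
original_ok = verify(original, signature, PUBLIC_EXPONENT)   # genuine check
forged_ok = verify(forged, signature, PUBLIC_EXPONENT)       # altered check
```

The bank’s side of the story is the verify function: it needs the check details, the signature, and the public key, but never the private key.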

Blockchain transactions are much like the electronic check described here:

  • They originate from an account identified by a public key.
  • They contain transaction details, such as an amount of cryptocurrency and the recipient, also identified by a public key.
  • They carry a digital signature proving the transaction details have been entered by the owner of the sender account through their private key.

Blockchain transactions don’t have to carry cryptocurrency; they can carry any data. The crucial point is that by carrying a digital signature, they can prove they’ve been genuinely sent by the sender.

Cryptographic hash functions aren’t only useful for digital signatures. If you’re interested in finding out more, read in the sidebar how you can use them to protect a seller from malicious buyers.

Protecting a seller from malicious buyers with a commit-reveal scheme

Cryptographic hash functions can be handy in various situations. Do you remember the decentralized e-commerce application I described at the beginning of chapter 1? If you’re reading this book with the mindset of a seller, you might have found the solution not as convincing as when seen through the eyes of a buyer. For example, there seems to be nothing, in the solution presented, preventing the buyer from accepting the goods and then not authorizing the payment to the seller. That’s disappointing! Don’t despair: cryptographic hash functions to the rescue!

You could make the application more secure for sellers if you required the buyer to generate a secret code, for instance a secret phrase or a random number, and then supply its cryptographic hash to the seller during the confirmation of the order. You could view this hash as a sort of lock on the payment. When delivery comes, the courier would hand the goods over only upon receipt of the secret code, which, when supplied to the e-commerce Dapp, would generate the expected initial hash and, like a physical key fitting its lock, would unlock the payment.

This way of initially providing a hash of the original information and then revealing the full information in a second stage is called a commitment scheme or commit-reveal scheme, and it has two phases:

  1. The commit phase, during which a cryptographic hash of the original information produced with a disclosed algorithm is committed to the other party
  2. The reveal phase, during which the full information is revealed, and it’s verified against the committed hash to prove the revealed information is indeed associated with the hash
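The two phases can be sketched as follows, assuming SHA-256 as the disclosed algorithm. The secret phrase and the function names are hypothetical.

```python
import hashlib

def commit(secret: str) -> str:
    """Commit phase: publish only the hash of the secret."""
    return hashlib.sha256(secret.encode("utf-8")).hexdigest()

def reveal_matches(revealed_secret: str, commitment: str) -> bool:
    """Reveal phase: check the revealed secret against the committed hash."""
    return commit(revealed_secret) == commitment

# The buyer commits when confirming the order...
commitment = commit("purple-giraffe-4821")

# ...and hands the secret to the courier on delivery.
payment_unlocked = reveal_matches("purple-giraffe-4821", commitment)
wrong_code = reveal_matches("purple-giraffe-0000", commitment)
```

In practice, a short or guessable secret could be found by brute force from its hash, so the secret should be long and random.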

This powerful idea of proving the knowledge of some information without revealing the information itself had already been used in the early 17th century by Galileo, who initially published his discovery of the phases of Venus in an anagram of the original paper, before finalizing his research. Hooke and Newton later used a similar technique to conceal the details of their discoveries, while at the same time being able to claim they were the first to make such discoveries.

In the rest of the book, you’ll see how this idea is used to secure decentralized applications.

Congratulations, you’ve completed the Cryptography 101 course! I hope it wasn’t too painful. You now have the necessary tools to understand how a blockchain works. Now we’ll enter the blockchain.

Blockchain

A blockchain is a distributed database that holds records called blocks. Figure 2.14 illustrates the structure of a typical blockchain.

Figure 2.14. A blockchain is a sequence of blocks, each containing a sequence number, a timestamp, and a list of transactions, each individually digitally signed. Each block also references the cryptographic hash of the previous block.

A block includes a list of transactions, which are digitally signed to prove their provenance. Most blockchains digitally sign transactions with an elliptic curve digital signature algorithm (ECDSA), based on elliptic-curve cryptography, rather than a traditional digital signature algorithm (DSA), because ECDSA is harder to break and uses smaller keys to guarantee the same level of security. Each block contains a timestamp and a link to a previous block based on its cryptographic hash. It also contains a cryptographic hash summarizing the full content of the block, including the hash of the previous block. In this way, the blockchain holds both the current state (the latest block) and the full history of all the transactions that have been stored on it since its inception.

This structure guarantees transactions can’t be tampered with or modified. A transaction recorded in a block can’t be altered retroactively because to modify it, the hash of the block containing it would have to be regenerated, and this wouldn’t match the existing one already referenced by subsequent blocks, as shown in figure 2.15.

Figure 2.15. An attempt at altering the contents of a block, for example its transactions, won’t be successful: the new hash generated from the altered block details won’t match the original block’s hash already directly referenced in the next block and indirectly referenced in the subsequent blocks.
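Here’s a toy sketch of this tamper-evidence property. It assumes SHA-256 over a JSON serialization of each block, which is my own simplification rather than the format of any real blockchain, and it leaves out timestamps and signatures to keep the example short.

```python
import hashlib
import json

def block_hash(block: dict) -> str:
    """Hash the full content of a block, including its previous-hash field."""
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()

def append_block(chain: list, transactions: list) -> None:
    """Append a block that references the hash of the current last block."""
    previous_hash = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({
        "number": len(chain),
        "transactions": transactions,
        "previous_hash": previous_hash,
    })

def is_chain_intact(chain: list) -> bool:
    """Verify that every block still matches the hash stored in its successor."""
    return all(
        chain[i]["previous_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )

chain = []
append_block(chain, ["Alice votes for candidate A"])
append_block(chain, ["Bob votes for candidate B"])
append_block(chain, ["Carol votes for candidate A"])
intact_before = is_chain_intact(chain)

# Altering a transaction in block 1 breaks the link stored in block 2.
chain[1]["transactions"] = ["Bob votes for candidate A"]
intact_after = is_chain_intact(chain)
```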

Note

If two transactions contradict each other—for instance, each of them tries to transfer all the funds of the same account to a different destination account (known as a “double-spend attack”)—miners will execute only the first one, recognized in the Ethereum network through a globally accessible sequence number. They will reject the second one, and it will never appear on a consolidated block. Satoshi Nakamoto of Bitcoin was the first to solve the double-spend problem. Every blockchain has a solution for it; otherwise, it wouldn’t be viable.

The blockchain structure I’ve described is, in fact, a simplified version of real-world blockchain data structures such as the Merkle tree used by Bitcoin or the Patricia tree used in Ethereum. A blockchain is managed autonomously through a P2P network that facilitates fault tolerance and decentralized consensus by processing all transactions independently on each node. Given these characteristics, blockchains are particularly suitable for recording permanently the history of events. This is useful for identity management, transaction processing, and provenance tracking, to name a few use cases.

Mining

To encourage the P2P network supporting the blockchain to process its transactions continuously, active processing nodes, also called mining nodes or miners, are rewarded for the computational resources provided, and indirectly to cover the associated electricity costs, through the consensus mechanism. Every few seconds, one successful miner is entitled to generate and keep a certain number of tokens of the cryptocurrency supported by the platform. Such cryptocurrency has economic value, as it can be used to purchase services on the network, but it also can be exchanged for conventional currencies, such as dollars, yen, euros, and so forth. In the case of the Bitcoin blockchain, they’ll be given several Bitcoin tokens, worth around $2,000 each at the time of writing. The tokens given by the Ethereum blockchain are called Ether, and they’re worth around $200 each at the time of writing. Let’s now look at how the consensus mechanism works.

Consensus

Consensus is, as I mentioned earlier, the mechanism by which participant nodes of the network agree on the outcome of a transaction. In the consensus definition I presented at the beginning of chapter 1, I also emphasized that consensus is distributed, because it’s determined by many participants, and trustless, because the participants don’t need to trust each other. In fact, consensus isn’t reached on individual transactions but on new blockchain blocks. Each participant verifies independently that a new block is valid and, if satisfied, propagates it further to the rest of the network.

What happens in practice is that if most participants have accepted the block as valid and it has propagated successfully throughout the network, miners will use such a block as the latest valid block, and the rest of the blockchain will be built on it. If a malicious miner appended an incorrect block to the blockchain and it propagated to its peer nodes, these nodes would reject the new block, and the malicious chain would die out immediately. The same would happen if a full node tried to modify a block before propagating it to its peers.

As you can see, the key step of the consensus mechanism is the verification of the latest block by a participant node. After verifying the digital signature of the individual transactions present on a block, a participant node verifies that the hash of the block is valid. This hash is produced by miners according to an agreed protocol. The earlier versions of Ethereum used an algorithm called Ethash, based on a Proof of Work protocol. Future versions will be based on a Proof of Stake protocol called Casper. I’ll explain both protocols.

Proof of Work

As you saw earlier, a block contains a cryptographic hash summarizing the full content of the block, including its metadata and transaction data, and an additional piece of data of a fixed length, such as 32 bits, called a nonce. The objective of the Proof of Work (PoW) protocol is that miners must find a nonce such that the hash generated fits a certain constraint, for instance, having many leading zeros. Constraining a 64-bit unsigned integer hash to have 13 leading zeros when represented in hexadecimal format, as in the example of figure 2.16, lowers the largest valid hash value from the theoretical maximum of 18,446,744,073,709,551,615 to 4,095, leaving only 4,096 valid values.

Figure 2.16. Proof of Work: generation of an unsuccessful and a successful block hash

Because of the properties of hash functions you saw earlier, the only way a miner can find such a nonce is by trying many possible values until the constraint on the hash has been met. In the example I just gave, every such attempt will only have a roughly 0.00000000000002% chance of being successful. When a satisfactory hash has been found, the miner is entitled to append the new block being processed to the blockchain and claim the token reward. As you can understand, this way of producing a valid hash is CPU-intensive, energy-consuming, and, consequently, economically expensive. The main reason for such an expensive algorithm is to dissuade malicious participants from appending new incorrect blocks or modifying preexisting blocks and making them look like genuine blocks. The amount of energy (and money) necessary to perform such actions would make them unviable. In the sidebar in section 3.3.4, I’ll give you an idea of the hardware most miners use.
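The brute-force search for a nonce can be sketched as follows. This is a toy version under my own assumptions: it uses SHA-256 rather than Ethash, a made-up block format, and a deliberately easy constraint of three leading hexadecimal zeros so that it finishes quickly.

```python
import hashlib

def mine(block_data: str, leading_zeros: int = 3):
    """Try nonces one by one until the hash starts with enough zeros."""
    target_prefix = "0" * leading_zeros
    nonce = 0
    while True:
        candidate = hashlib.sha256(
            f"{block_data}|{nonce}".encode("utf-8")
        ).hexdigest()
        if candidate.startswith(target_prefix):
            return nonce, candidate
        nonce += 1

nonce, winning_hash = mine("block 42: Alice votes for candidate A")

# Chance of success per attempt in the 64-bit example from the text:
# 16**3 valid hashes out of 2**64 possible ones, roughly 2.2e-14 percent.
chance_percent = 16**3 / 2**64 * 100
```

Each extra leading zero required makes the search about 16 times longer on average, whereas verifying a proposed nonce takes a single hash calculation. That asymmetry is what makes the work expensive to produce but cheap to check.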

Proof of Stake

Proof of Work, which the Bitcoin network also uses, has been widely criticized for the immense amount of energy consumed (or rather, wasted?) by the competing miners. It has been estimated that the Bitcoin network alone will consume as much electricity as Bulgaria by 2020.

To tackle this problem, Vitalik Buterin, one of the Ethereum founders, has proposed an alternative approach based on Proof of Stake. This approach relies on a pool of validators that vote on the validity of new blockchain blocks. To join the validator pool, which is open to anyone, a node must commit an Ether deposit that will be held until the node leaves the pool. Votes expressed by each node are weighted by the amount of the deposit committed (which equates to the stake of a node in the pool). Under this scheme, a validator profits from transaction fees that the transaction senders pay. If a validator cheats, the associated Ether deposit is deleted from the network and the owner is banned from rejoining, which acts as a deterrent against manipulation.
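Stake-weighted voting can be sketched as a simple tally. Treat the details as assumptions: the two-thirds acceptance threshold, the validator names, and the function name are illustrative choices of mine, not a description of the actual Casper rules.

```python
from fractions import Fraction

def block_accepted(deposits: dict, votes: dict,
                   threshold: Fraction = Fraction(2, 3)) -> bool:
    """Accept a block if the approving share of the total stake meets the threshold."""
    total_stake = sum(deposits.values())
    approving_stake = sum(
        deposit for validator, deposit in deposits.items()
        if votes.get(validator, False)
    )
    return Fraction(approving_stake, total_stake) >= threshold

# Deposits in Ether committed by each validator (hypothetical figures).
deposits = {"validator-a": 50, "validator-b": 30, "validator-c": 20}

# Validators holding 80% of the stake approve: the block is accepted.
large_stake_result = block_accepted(
    deposits, {"validator-a": True, "validator-b": True, "validator-c": False})

# Two validators out of three approve, but they hold only 50% of the stake.
small_stake_result = block_accepted(
    deposits, {"validator-a": False, "validator-b": True, "validator-c": True})
```

The second case shows why the weighting matters: a majority of validators by head count isn’t enough if their combined deposit is too small.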

You’ve now covered all the general cryptographic techniques underlying blockchain databases. If you’d like to learn more about the subject, I encourage you to read Grokking Bitcoin by Kalle Rosenbaum (Manning, 2019). Let’s now examine more recent technologies that have simplified Dapp development.

The Merkle tree and Merkle root

The blockchain structure I’ve shown in the previous diagrams is a simplified representation of a real one. Generally, a miner structures a block in two parts: a header and a body, as shown in figure 2.17. The body contains all the transactions included in the block. The header contains the block metadata you saw earlier, such as the block number, timestamp, previous block hash, and PoW nonce. It also contains the Merkle root of the transactions Merkle tree that the miner calculates.

Figure 2.17. The structure of a block, including a header containing metadata, such as the block number, timestamp, previous block hash, and Merkle root of the transactions Merkle tree, and a body containing the transactions collection

The transactions Merkle tree, as shown in figure 2.18, is a tree structure built as follows:

  • The block’s transactions are placed at the bottom of the tree, arranged in pairs.
  • Each transaction is hashed, and each of these hashes becomes a leaf of the Merkle tree.
  • A hash is calculated for each pair of contiguous hashes.
  • The hashing of contiguous hashes is repeated until only two hashes remain. The hash of these two final hashes is the Merkle root.

Figure 2.18. A Merkle tree. Individual transactions are located at the bottom; the tree’s leaves are the hashes of the individual transactions; and the next row up is made of the hashes of the tree’s leaves. The top row, which is the hash of the hashes below, ends the tree: this is the Merkle root.
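The construction can be sketched as follows. The assumptions here are SHA-256 as the hash function, transactions represented as plain strings, and the convention of duplicating the last hash when a level has an odd number of entries (a convention Bitcoin uses; the text doesn’t prescribe one).

```python
import hashlib

def sha256_hex(data: str) -> str:
    """Return the SHA-256 hash of a string as hexadecimal text."""
    return hashlib.sha256(data.encode("utf-8")).hexdigest()

def merkle_root(transactions: list) -> str:
    """Hash each transaction, then hash pairs repeatedly until one hash remains."""
    if not transactions:
        raise ValueError("at least one transaction is required")
    level = [sha256_hex(tx) for tx in transactions]    # the leaves
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])   # assumption: duplicate the last hash
        level = [
            sha256_hex(level[i] + level[i + 1])
            for i in range(0, len(level), 2)
        ]
    return level[0]

root = merkle_root(["tx-a", "tx-b", "tx-c", "tx-d"])

# Changing any single transaction produces a different root.
tampered_root = merkle_root(["tx-a", "tx-b", "tx-X", "tx-d"])
```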

The Merkle root is therefore a single hash summarizing all of the transactions contained in the block in a way that guarantees their integrity. The advantage of having the Merkle root on the block header is that a client can synchronize the blockchain in a faster way by retrieving the block headers, rather than the entire transaction history, from the network peers. This is generally called light synchronization.

2.2.2. Ethereum technologies

Although smart contracts can be implemented, with some difficulty, on early blockchain systems such as Bitcoin, they can be more easily written and executed on later blockchain platforms, such as Hyperledger, Nxt, and Ethereum, that have been designed with the main purpose of simplifying their development. For this reason, later blockchain platforms are considered part of the so-called smart blockchain or blockchain 2.0 wave. Let’s now examine briefly the main innovations that Ethereum has introduced: an improved blockchain design, the EVM, and smart contracts.

The Ethereum blockchain

In the previous section, you learned about the blockchain and a more efficient structure that allows quicker client synchronization based on a block header containing a block’s metadata and a body containing the transactions. The Ethereum blockchain improves the design further. First of all, transactions are hashed in a more compact and efficient (yet still cryptographically authenticated) structure called a Merkle-Patricia trie (see sidebar for more details). Secondly, the block header (generated as usual by the miner) also contains, in addition to the Merkle-Patricia root of the transactions, the Merkle-Patricia root of the receipts, which are the transaction outputs, and the Merkle-Patricia root of the current blockchain state, as shown in figure 2.19.

Figure 2.19. The improved Ethereum block header. The header of a block of the Ethereum blockchain contains the root of the transactions Merkle-Patricia trie, which is a more compact and efficient structure than a Merkle tree. In addition, it contains the Merkle-Patricia roots of the receipts (which are the transactions’ effects) and of the blockchain state.

As Vitalik Buterin explained in his blog post “Merkling in Ethereum,”[1] with these three Merkle-Patricia tries, a client can efficiently check, in a verifiable way, the following:

1

See Vitalik Buterin, “Merkling in Ethereum,” Ethereum Blog, November 15, 2015, http://mng.bz/QQYe.

  • Whether a certain transaction is included in a certain block
  • What the output of a transaction would be
  • Whether an account exists
  • What the balance of an account is
The Merkle-Patricia trie

A trie[2] (or prefix tree) is an ordered data structure you use to store a dynamic set, where the keys are usually strings. The root of a trie is an empty string, and then all the descendants of a node have the common prefix of the string associated with that node, as you can see in the figure.

2

See the “Trie” Wikipedia page at https://en.wikipedia.org/wiki/Trie for more information on this data structure.

Trie structure (Credit: Booyabazooka (based on a PNG image by Deco). Modifications by Superm401. - own work (based on PNG image by Deco))
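The following is a minimal sketch of a plain trie (not the Merkle-Patricia variant), intended only to illustrate how keys that share a prefix share a path from the root. The class and method names are hypothetical.

```python
class TrieNode:
    """One node of the trie; the path from the root spells out a prefix."""

    def __init__(self):
        self.children = {}   # maps the next character to the child node
        self.is_key = False  # True if the path to this node is a stored key
        self.value = None


class Trie:
    def __init__(self):
        self.root = TrieNode()  # the root corresponds to the empty string

    def insert(self, key, value):
        node = self.root
        for ch in key:
            node = node.children.setdefault(ch, TrieNode())
        node.is_key = True
        node.value = value

    def _find(self, prefix):
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return None
        return node

    def get(self, key):
        node = self._find(key)
        return node.value if node is not None and node.is_key else None

    def keys_with_prefix(self, prefix):
        """Return, in sorted order, all stored keys starting with the prefix."""
        start = self._find(prefix)
        if start is None:
            return []
        found, stack = [], [(start, prefix)]
        while stack:
            node, path = stack.pop()
            if node.is_key:
                found.append(path)
            for ch, child in node.children.items():
                stack.append((child, path + ch))
        return sorted(found)


if __name__ == "__main__":
    trie = Trie()
    for number, word in enumerate(["to", "tea", "ted", "ten", "inn"]):
        trie.insert(word, number)
    print(trie.keys_with_prefix("te"))
```

All descendants of the node reached by "te" are exactly the keys that begin with "te", which is the property described above.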

The Merkle-Patricia trie is a data structure that combines a trie and Merkle tree. It improves the efficiency of a Merkle tree (named after Ralph Merkle) by storing the node keys using the PATRICIA algorithm (practical algorithm to retrieve information coded in alphanumeric), designed by D. R. Morrison in 1968. You can read about the Patricia algorithm on the Lloyd Allison Algorithm Repository.[3] The Ethereum Merkle-Patricia trie is described in detail, with code examples, in the Ethereum wiki.[4]

3

4

When a full node receives a new block, the transactions contained in the body are processed as follows:

  • The transactions are arranged in a transaction Merkle-Patricia trie specific to the new block.
  • Transactions are executed on the EVM. This action generates transaction receipts, which are arranged in a receipts Merkle-Patricia trie specific to the new block. It also alters the global state trie, of which only one instance exists on each node.

If the roots of the new transaction trie, receipt trie, and modified state trie match those in the header, the block is considered validated. Then the new and altered tries are stored on the full node in their respective key-value stores, which are based on LevelDB, an open source NoSQL database developed by Google. Note the following in figure 2.20:

  • The transaction store contains a transaction trie per block, and each trie is immutable. The key of this store is the transaction hash (keccak 256-bit hash).
  • The receipts store contains a receipts trie per block, and each trie is immutable. The key of this store is the hash of the receipts of a transaction (keccak 256-bit hash).
  • The state store contains a single state trie that represents the latest global state and is updated each time a new block is appended to the blockchain. The state trie is account-centric, so the key of this store is the account address (160 bits, that is, 20 bytes).

Figure 2.20. Detailed block processing in an Ethereum node. When a full node receives a new block, it separates the header and the body. It then creates a local transactions trie and a local receipts trie and updates the existing state trie. The new and updated tries are then committed in the respective stores.
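The validation flow described above can be sketched as follows. This is a deliberately simplified illustration, not how a real client works: `toy_root` is a flat hash standing in for a Merkle-Patricia root, SHA-256 stands in for Keccak-256, a transaction is a hypothetical `(sender, recipient, amount)` tuple, and `apply_transaction` is a stand-in for execution on the EVM.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List, Tuple

Transaction = Tuple[str, str, int]  # hypothetical: (sender, recipient, amount)


def digest(data: bytes) -> bytes:
    # Stand-in for Keccak-256, which Ethereum actually uses.
    return hashlib.sha256(data).digest()


def toy_root(entries: Dict[bytes, bytes]) -> bytes:
    # Stand-in for a Merkle-Patricia root: one hash over sorted key/value pairs.
    h = hashlib.sha256()
    for key in sorted(entries):
        h.update(digest(key) + digest(entries[key]))
    return h.digest()


@dataclass
class Header:
    transactions_root: bytes
    receipts_root: bytes
    state_root: bytes


def apply_transaction(state: Dict[str, int], tx: Transaction) -> str:
    # Stand-in for EVM execution: move funds and return a receipt status.
    sender, recipient, amount = tx
    if state.get(sender, 0) < amount:
        return "failed"
    state[sender] -= amount
    state[recipient] = state.get(recipient, 0) + amount
    return "ok"


def state_root(state: Dict[str, int]) -> bytes:
    return toy_root({addr.encode(): str(bal).encode() for addr, bal in state.items()})


def process_body(state: Dict[str, int], transactions: List[Transaction]):
    """Build the per-block transaction and receipt roots and the new state."""
    new_state = dict(state)  # work on a copy; commit only if the block is valid
    tx_entries, receipt_entries = {}, {}
    for index, tx in enumerate(transactions):
        encoded = f"{index}:{tx[0]}:{tx[1]}:{tx[2]}".encode()
        tx_entries[digest(encoded)] = encoded
        receipt = f"{index}:{apply_transaction(new_state, tx)}".encode()
        receipt_entries[digest(receipt)] = receipt
    return toy_root(tx_entries), toy_root(receipt_entries), new_state


def validate_block(header: Header, transactions: List[Transaction], state: Dict[str, int]):
    """Return (is_valid, new_state); the roots must match those in the header."""
    tx_root, receipts_root, new_state = process_body(state, transactions)
    computed = Header(tx_root, receipts_root, state_root(new_state))
    return computed == header, new_state
```

A node would commit `new_state` and the two per-block structures to its stores only when `validate_block` reports that all three recomputed roots match the header.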

A major benefit of the Ethereum blockchain design is that it allows three types of synchronization:

  • Full—Your client downloads the entire blockchain and validates all blocks locally. This is the slowest option, but you’d be confident of the integrity of the local blockchain copy.
  • Fast—Your client downloads the entire blockchain, but validates only the 64 blocks prior to the start of the synchronization and the new ones.
  • Light—Your client retrieves the current state trie of the blockchain from a peer full node and stores it locally. It doesn’t retrieve any historic blocks from peers, and it receives only the new ones, so you don’t have to wait long. This will allow you to get up and running quickly.
Note

Although in this section I’ve covered the physical design of the Ethereum blockchain in some detail because it’s important you understand how transactions and state are maintained, in the rest of the book I’ll use simplified logical diagrams in which I’ll represent a block as a collection of transactions.

Ethereum Virtual Machine

The Ethereum Virtual Machine (EVM) is similar in purpose to the Java Virtual Machine (JVM) or the .NET Common Language Runtime (CLR). It runs on each node of the Ethereum P2P network. It’s Turing complete, which means it can run code of any complexity. It can access blockchain data, both in read and write mode. The EVM executes code only after the digital signature of the transaction that triggers it has been verified and constraints based on the current state of the blockchain are satisfied.
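To give a feel for what executing bytecode means, here is a toy interpreter for a tiny subset of EVM opcodes. It relies on two facts not covered in this section: the EVM is a stack machine operating on 256-bit words, and the opcodes STOP (0x00), ADD (0x01), MUL (0x02), and PUSH1 (0x60) have these values. Everything else (gas, memory, storage, signature checks) is left out, so treat it as a sketch rather than a description of a real client.

```python
STOP, ADD, MUL, PUSH1 = 0x00, 0x01, 0x02, 0x60
WORD = 2 ** 256  # EVM arithmetic wraps around at 256 bits


def run(bytecode: bytes) -> list:
    """Execute the bytecode and return the final stack (top is the last item)."""
    stack = []
    pc = 0  # program counter: index of the next byte to read
    while pc < len(bytecode):
        op = bytecode[pc]
        pc += 1
        if op == STOP:
            break
        elif op == PUSH1:
            if pc >= len(bytecode):
                raise ValueError("PUSH1 is missing its one-byte operand")
            stack.append(bytecode[pc])
            pc += 1
        elif op == ADD:
            a, b = stack.pop(), stack.pop()
            stack.append((a + b) % WORD)
        elif op == MUL:
            a, b = stack.pop(), stack.pop()
            stack.append((a * b) % WORD)
        else:
            raise ValueError(f"unsupported opcode 0x{op:02x}")
    return stack


if __name__ == "__main__":
    # PUSH1 2, PUSH1 3, ADD, PUSH1 4, MUL, STOP computes (2 + 3) * 4
    program = bytes([0x60, 0x02, 0x60, 0x03, 0x01, 0x60, 0x04, 0x02, 0x00])
    print(run(program))
```

Running the sample program leaves the single value 20 on the stack.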

Smart contract

A smart contract, or simply contract, encapsulates the logic of a decentralized application. As I mentioned earlier, an Ethereum smart contract is written in a high-level language, such as Solidity or Serpent, and is compiled into EVM bytecode. It gets deployed on each node of the P2P network and is executed on the EVM.

Next generation blockchain

Thanks to the EVM, Ethereum is a programmable blockchain. Therefore, you can develop any type of decentralized application on it, not only cryptocurrencies, as was the case for earlier blockchains. Because of this programmability, Ethereum is considered a generalized or next generation blockchain. Some go as far as thinking smart blockchain platforms will be the foundation of a new generation of the internet, a Web 3.0 (although the same label is also used by the “semantic web” community), which will be characterized by more empowered users.

2.3. Ethereum’s history and governance

Before closing the chapter, I’d like to share how Ethereum was created and how it evolved after the initial release. In the next few chapters, you’ll start using several components of the Ethereum platform. Before you do so, it’s important you understand how these components came about and what the process is for proposing and agreeing on changes. You’ll realize decentralization isn’t only a technical aspect of Ethereum; it’s almost a philosophy that also permeates its governance.

2.3.1. Who created Ethereum?

Ethereum is the brainchild of Vitalik Buterin, who has followed Bitcoin and cryptocurrency technology since 2011, when he also cofounded Bitcoin Magazine. After researching the possibility of generalizing blockchain technology for building any application, in November 2013 he wrote the Ethereum White Paper (https://github.com/ethereum/wiki/wiki/White-Paper), in which he laid out the design of the Ethereum protocol, together with the first details of the smart contract infrastructure. Among the first people who engaged with Vitalik’s vision were Gavin Wood, who contributed to the shaping of the protocol and became the lead developer of the C++ client, and Jeffrey Wilcke, who became the lead developer of the Go client. After only a few months of work, in January 2014, Vitalik announced the Ethereum initiative on bitcointalk[5] and received considerable response. Soon afterwards, in April 2014, Gavin wrote the Ethereum Yellow Paper,[6] which specifies the design of the Ethereum Virtual Machine. To accelerate the development of the platform, in July 2014 Ethereum raised around $18.4M through an Ether crowdsale, which was legally backed by the Ethereum Foundation, set up in Switzerland only one month earlier with the mission to do the following (quoting the official website at https://www.ethereum.org/foundation):

5

See Vitalik Buterin, “Welcome to the Beginning,” https://bitcointalk.org/index.php?topic=428589.0.

6

See the Ethereum Yellow Paper repository, https://github.com/ethereum/yellowpaper.

...promote and support Ethereum platform and base layer research, development and education to bring decentralized protocols and tools to the world that empower developers to produce next generation decentralized applications (Dapps), and together build a more globally accessible, more free and more trustworthy Internet.

Table 2.1 summarizes Ethereum’s timeline since its inception to the time of writing of this book.

Table 2.1. Ethereum timeline from inception to summer 2018
Sep 2011 Vitalik Buterin cofounds Bitcoin Magazine with Mihai Alisie.
Nov 2013 Vitalik publishes the Ethereum White Paper, presenting the design of the Ethereum protocol and the smart contract infrastructure.
Dec 2013 Gavin Wood contacts Vitalik, and they start detailed design discussions.
Jan 2014 Vitalik makes the official Ethereum announcement on bitcointalk.
Apr 2014 Gavin Wood publishes the Ethereum Yellow Paper, which specifies the Ethereum Virtual Machine (EVM).
Jun 2014 The Ethereum Foundation is set up in Switzerland.
Jul 2014 Ethereum raises $18.4M through an Ether crowdsale.
Aug 2014 Vitalik Buterin, Gavin Wood, and Jeffrey Wilcke set up ETH DEV, a nonprofit organization focused on the development of the core Ethereum protocol and infrastructure, which managed the development of various proofs of concept throughout 2014.
Nov 2014 ETH DEV organizes DEVCON-0, the first Ethereum developer conference, in Berlin, where the entire Ethereum project team meets for the first time.
Jan 2015 The Go Ethereum team meets in Amsterdam, where Whisper Dapp and Mist prototypes are presented.
Jul 2015 Mainnet release 1.0, codenamed Frontier, and the stable beta of Ethereum Wallet are released.
Nov 2015 In London, 400 people attend DEVCON-1, where more than 80 talks on each part of the Ethereum ecosystem are given.
Mar 2016 The project releases Mainnet release 2.0, codenamed Homestead.
Jul 2016 An unplanned Ethereum fork occurs following a DAO attack, and a split takes place between Ethereum and Ethereum Classic. (See the sidebar.)
Oct 2017 The project releases Mainnet release 3.0, codenamed Byzantium.
Jun 2018 The project releases the Proof of Stake (PoS) Testnet release, codenamed Casper.

If you’re interested in knowing more about the history of Ethereum, the official documentation has a page[7] dedicated to it. But you can get a firsthand and more engaging account of the main events that took place around Ethereum’s creation in the blog posts “Cut and Try: Building a Dream,”[8] by Taylor Gerring (a core Ethereum developer), and “A Prehistory of the Ethereum Protocol,”[9] by Vitalik Buterin himself.

7

See “History of Ethereum,” Ethereum Homestead, http://mng.bz/XgwM.

8

See Taylor Gerring, “Cut and Try: Building a Dream,” February 9, 2016, http://mng.bz/y1BE.

9

See Vitalik Buterin, “A Prehistory of the Ethereum Protocol,” http://mng.bz/MxRm.

The DAO attack and the split between Ethereum and Ethereum Classic

The DAO (which stands for decentralized autonomous organization) was one of the first mainstream Dapps in the Ethereum space. It was a decentralized venture capital fund. The DAO token holders were meant to vote on all investment decisions. While the DAO smart contract was still being developed, tokens were sold to investors through a crowdsale, a sort of decentralized crowdfunding application. (You’ll read more about crowdsales in chapters 6 and 7.) This funding campaign, which took place in May 2016, managed to collect over 12M Ether, which at the time was worth around $150M (with Ether trading at $11).

One of the features of the DAO contract was that groups of DAO token holders unhappy with decisions that the qualified majority made (investment decisions were approved with 20% of the votes) could split from the main DAO and create their own Child DAO, where they’d start to vote on different investment proposals. In June 2016, this feature, which had been identified by some community members as potentially weak from a security point of view, was exploited by a hacker, who managed to gain control of 3.5M Ether (worth around $50M at that time) through a recursive call that kept withdrawing funds.

Luckily, the Child DAO creation feature required funds to be withheld for 28 days before they could be transferred out to another account, so the hacker couldn’t steal them immediately. This gave the DAO developers and the Ethereum community some time to propose solutions to prevent the theft. Finally, after a failed soft fork of the blockchain that would have blacklisted any transaction coming out of the DAO, the community voted for a hard fork, including a smart contract designed to return the stolen funds to the original owners. Although the majority had voted for the hard fork, some members of the community argued that the hard fork had broken various principles of the Ethereum white paper, mainly the promise that smart contract code is implicitly law and the guarantee that the blockchain is immutable. They consequently decided to keep the original blockchain running, and this was renamed Ethereum Classic.

You can find many articles and blog posts on the DAO attack, ranging in complexity from the technical to the high-level. Given that you don’t yet have a strong technical foundation in this area, if you want to learn more about this, I recommend you have a look at “The DAO, The Hack, The Soft Fork and The Hard Fork,”[10] which describes in detail what happened without getting too much into the technical side. You’ll be able to understand the DAO attack better after having read chapter 15 on security, but I don’t cover it specifically there because most of the techniques used are beyond the scope of this book. Nevertheless, if at that point you’re eager to jump to the technical details of the hack, I recommend the brilliant “Analysis of the DAO Exploit.”[11]

10

See Antonio Madeira, “The DAO, The Hack, The Soft Fork and The Hard Fork,” July 26, 2016, http://mng.bz/a7NY.

11

See Phil Daian, “Analysis of the DAO exploit,” Hacking, Distributed, June 18, 2016, http://mng.bz/gNrn.

2.3.2. Who controls Ethereum’s development?

After the Frontier release back in July 2015, the hot topic of Ethereum governance started to gather momentum within the Ethereum Foundation, as well as across the wider Ethereum community. Key questions, such as “Who controls Ethereum’s development,” “How do changes get proposed,” and “Who approves them and how” got debated openly so that early adopters could be encouraged to use and trust the platform.

Blockchain governance is about the rules and processes that participants must follow for making changes to the platform, and about how the rules and processes themselves should get defined. In short, it’s about who decides on changes and how the decisions get approved and followed.

ETH DEV, the nonprofit organization leading Ethereum development, gathers proposals in the Ethereum Improvement Proposals (EIPs, https://eips.ethereum.org/) repository.[12] This is based on established processes also followed by other open source projects; Python Enhancement Proposals (PEPs) and Bitcoin Improvement Proposals (BIPs) are classic examples. Proposals are initially studied, and Proofs of Concept (PoCs) often follow.

12

See the EIPs page on GitHub at https://github.com/ethereum/EIPs.

If a proposal gathers enough momentum (it’s considered interesting by the core Ethereum developers), it progresses to Draft status and might be debated further among the wider community at developer conferences or official online forums. If an informal consensus is reached, the proposal can progress immediately to Accepted or Rejected status. Accepted proposals get scheduled for future platform releases, and more effort is consequently put into them. Obviously, there’s always the risk that the participants won’t all agree with the proposal or its implementation, so the proposal is considered implicitly accepted only after most participants have adopted it.

Occasionally, some proposals cause heated debate in the wider community. In those cases, the decision isn’t clear-cut, and they go through formal on-chain voting. When it comes to on-chain voting, one of the following two competing models is generally followed:

  • Loosely coupled on-chain voting (aka informal governance)—The community leaders (for example the Ethereum Foundation and ETH DEV) signal how to vote. Participants vote on-chain through a dedicated smart contract that weights their preference based on how much Ether they own. (This voting is often referred to as coinvoting.) The proposal is then implemented if the outcome of the voting is favorable. Although the vote is ethically binding, developers or other key participants, such as miners, might always decide not to implement or adopt the winning proposal, at risk of being stigmatized by the community.
  • Tightly coupled on-chain voting (a.k.a on-chain governance)—The proposal gets fully implemented before the vote takes place, generally by a group of developers backing it, and then a smart contract switches on the functionality in the production network only following successful on-chain voting. This model is often favored by purists, who argue the technical analysis shouldn’t be influenced by politics until the last stage.
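As a rough illustration of coinvoting, the sketch below tallies votes weighted by each participant's Ether balance. The data structures and the function name are hypothetical; a real implementation would be a smart contract reading balances from the blockchain state.

```python
from typing import Dict, Tuple


def coin_vote(balances: Dict[str, int], votes: Dict[str, bool]) -> Tuple[int, int]:
    """Return (weight in favor, weight against), weighting each vote by balance."""
    in_favor = sum(balances.get(voter, 0) for voter, choice in votes.items() if choice)
    against = sum(balances.get(voter, 0) for voter, choice in votes.items() if not choice)
    return in_favor, against


if __name__ == "__main__":
    balances = {"alice": 100, "bob": 30, "carol": 30}
    votes = {"alice": False, "bob": True, "carol": True}
    print(coin_vote(balances, votes))
```

In this example two of the three voters are in favor, yet the proposal loses because the single opponent holds more Ether than the other two combined, which is one reason coinvoting is debated.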

Tightly coupled on-chain voting has been introduced in various blockchain platforms and has become somewhat fashionable. But like other established blockchain platforms, such as Bitcoin and Zcash, Ethereum tends to follow the principle of loosely coupled voting, openly supported by Vitalik Buterin in his blog post “Notes on Blockchain Governance.”[13]

13

See Vitalik Buterin, “Notes on Blockchain Governance,” December 17, 2017, https://vitalik.ca/general/2017/12/17/voting.html.

As you can see, Ethereum governance is relatively informal and centralized, as core developers seem to have more decision-making weight than the wider community. The argument is that if everything went through formal voting, the platform would evolve too slowly. If you’re interested in reading more on Ethereum governance, I recommend the following articles:

  • “Ethereum Is Throwing Out the Crypto Governance Playbook”[14]

    14

    See Rachel Rose O’Leary, “Ethereum Is Throwing Out the Crypto Governance Playbook,” Coindesk, March 14, 2018, http://mng.bz/edwZ.

  • “Experimental Voting Effort Aims to Break Ethereum Governance Gridlock”[15]

    15

    See Rachel Rose O’Leary, “Experimental Voting Effort Aims to Break Ethereum Governance Gridlock,” Coindesk, May 23, 2018, http://mng.bz/pgQ0.

  • “A user’s perspective and introduction to blockchain governance”[16]

    16

    See Richard Red, “A user’s perspective and introduction to blockchain governance,” Medium, http://mng.bz/O2VO.

Summary

  • An Ethereum node hosts an Ethereum client and a copy of the blockchain.
  • An Ethereum client contains

    • a virtual machine called Ethereum Virtual Machine (EVM), capable of executing smart contract bytecode
    • a memory pool, where transactions received by the node get stored before being propagated further into the network
    • a JSON-RPC API, which exposes the functionality of the client to other nodes and external users
    • a client process, which coordinates the processing
  • An Ethereum smart contract is code written in a high-level language, such as Solidity, and compiled into EVM bytecode.
  • An Ethereum smart contract is deployed across the P2P network through a contract deployment transaction, pushed to a local Ethereum node, and then propagated throughout the network.
  • A blockchain is a sequence of blocks, each containing a sequence number, a timestamp, and a list of transactions, each individually digitally signed. Each block includes a copy of the cryptographic hash of the previous block and the nonce, which generates the hash of the current block.
  • The main innovations introduced by Ethereum with respect to previous blockchain implementations are the EVM and the concept of the smart contract.
  • Ethereum follows an informal governance model, where proposals go through the Ethereum Improvement Proposals (EIPs) process: they get analyzed by the core Ethereum developers, are often tried through Proofs of Concept (PoCs), and ultimately get accepted or rejected.
  • Occasionally, when an EIP causes heated debate, it gets formally voted on on-chain by the participants, but, even if the vote is favorable, a proposal is considered practically accepted only when the majority of the participants have adopted it.