Chapter 1. Blockchain Concepts

Fundamentally, a blockchain is a data structure. It is a linked list, or chain, of unique “blocks.” Each block points to the previous one, and is itself a list of transactions. On top of this relatively simple list-of-lists data structure is laid the key innovation that blockchains have given us: a protocol for how blocks are added to the chain without any central authority. Figure 1-1 illustrates this structure.

The cryptocurrencies that have come along with blockchains are a means to an end, providing incentives for people to run software that secures the network. For the first time in history, we have the ability to share digital information without trusting any person, government, organization, or corporation to facilitate the interaction. Ethereum provides a cryptographically secure platform for storing, updating, and removing data from a blockchain using what is referred to as “smart contracts.” We are still in the early days of learning how to use smart contracts to improve things in the real world, and it’s hard to predict how the technology will be used in the future. Similar to the “World Wide Web” in the 1990s, there has been a global influx of inspiring and creative problem-solvers who are working every day to deploy decentralized applications (DApps) that they hope will “make a dent in the universe.”

Figure 1-1. The structure of a blockchain
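To make the list-of-lists idea concrete, here is a minimal sketch in Python. It is an illustration, not how any Ethereum client stores blocks: the Block class, the append_block and is_valid helpers, and the use of SHA-256 over JSON are all assumptions chosen for readability. The point is only that each block carries a list of transactions plus a reference to its parent, so changing an old block breaks every link after it.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Block:
    number: int
    transactions: List[str]      # each block is itself a list of transactions
    parent_hash: Optional[str]   # reference to the previous block (None for the first block)

    def hash(self) -> str:
        # Hash the full contents, including the parent reference, so that
        # each block commits to the entire history before it.
        payload = json.dumps(
            {"number": self.number, "txs": self.transactions, "parent": self.parent_hash},
            sort_keys=True,
        )
        return hashlib.sha256(payload.encode()).hexdigest()


def append_block(chain: List[Block], transactions: List[str]) -> Block:
    """Create a block that points at the current tip and add it to the chain."""
    parent_hash = chain[-1].hash() if chain else None
    block = Block(number=len(chain), transactions=transactions, parent_hash=parent_hash)
    chain.append(block)
    return block


def is_valid(chain: List[Block]) -> bool:
    """Check that every block still points at the hash of its predecessor."""
    return all(chain[i].parent_hash == chain[i - 1].hash() for i in range(1, len(chain)))


chain: List[Block] = []
append_block(chain, ["alice pays bob 5"])
append_block(chain, ["bob pays carol 2"])
print(is_valid(chain))
```

Editing a transaction in the first block after the fact changes that block's hash, so the second block's parent reference no longer matches and is_valid returns False.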

We will be spending most of this book focused on the programming language Solidity and smart contract development. Solidity is a popular programming language for developing smart contracts, and was designed to run on the Ethereum Virtual Machine (EVM). Solidity wasn’t the first programming language to run on the EVM, and it certainly won’t be the last. Other languages, such as Vyper, also target the EVM, attempting to improve on Solidity’s design or provide powerful domain-specific languages. Before we dig into the revolutionary world of smart contracts, we need to lay a conceptual foundation for you to build on. Due to the historically unique nature of running code on a blockchain, it’s critical for developers to have a sufficient understanding of how the pieces fit together underneath the powerful abstractions that Ethereum provides.

A Brief History

The concept of a “blockchain” was born out of the Bitcoin white paper published in 2008 by the pseudonymous Satoshi Nakamoto. Although the term “blockchain” doesn’t actually appear in the paper, the concept was concisely articulated. Transactions of value exchange enter a peer-to-peer network, and are periodically grouped into “blocks,” or lists. When a “block” of transactions is persisted, it is “chained” to the previous block. This append-only data structure and the protocol that constructs it create an immutable record of transactions.

Bitcoin’s launch in early 2009 marked the beginning of public blockchain networks. Since then, countless cryptocurrencies have attempted to build on Bitcoin’s success as a new form of currency. Many early adopters of Bitcoin realized that the properties of a blockchain had applications beyond financial transactions. Communities spawned to try to extend, fork, and build on top of Bitcoin in order to stretch it in directions that were not previously conceived. Ultimately, though, the Bitcoin protocol is purposely constrained and ill-fitted for extension. Blockchain prodigy Vitalik Buterin made the ambitious decision to stop trying to extend Bitcoin and instead create a more general-purpose protocol from scratch. In 2013, Vitalik wrote the Ethereum white paper.

When Ethereum launched in 2015, it quickly became one of the most valuable cryptocurrencies on the planet, second only to Bitcoin. The markets valued Ethereum because it provided a platform for deploying and running smart contracts on a public blockchain. The term “smart contract” was coined by Nick Szabo in 1994. The idea then was that many legal contracts, notaries, and other analog agreements could be enforced nearly automatically using digital protocols and cryptographic signatures. Despite this historical context, the implementation of smart contracts on Ethereum actually feels more like general-purpose programming than anything specific to legal contracts. The virtual machine as defined by the Ethereum protocol is Turing-complete. This means that as long as a computation fits within the limits of a single block, smart contract developers have few constraints to contend with beyond their own imagination.

How is a blockchain different from technologies you might have worked with before? In the next section, let’s discuss what makes the blockchain unique.

The Character of a Blockchain

Many software developers will have worked with a technology stack that includes 1) a native mobile user interface and/or web user interface 2) with a server-side programming language that ultimately interacts with 3) a database. In the most basic versions of these systems, the interactions with the database are essentially instantaneous and permanent.

Like a more typical database, blockchains can store arbitrary data, but they share few other similarities. Becoming a competent smart contract developer means understanding the character of a blockchain. One cannot simply treat Solidity like a server-side programming language; you will quickly find yourself lost and frustrated. Unlike typical databases, interactions with even the most basic smart contract systems are not instantaneous and are not guaranteed to be permanent.

Unlike a typical database, which is a single program running on one computer, a blockchain is typically made up of many nodes in a worldwide network. When we refer to “nodes” in the context of a blockchain, we are referring to software that someone has installed on a computer and connected to the blockchain network. Just as there are many different software implementations of HTTP (Apache, NGINX) called “web servers,” there are many software implementations of the Ethereum protocol (geth, Parity) called Ethereum nodes.

Next, let’s discuss how we can use a network to connect with a blockchain.

Decentralized Networks

In order to work with a typical database, we need a database connection and sufficient rights to update the database. In the simplest cases there’s a single database to contend with, so we need a single IP address to make the connection. The entire system relies on that database being available. If the database were to crash or lose connectivity, or if our rights were revoked, the application stops working. This is what is referred to as a centralized system.

Blockchains were designed to run over decentralized networks. People and companies run nodes in the network, and, in many respects, all nodes are peers. This is possible because every node contains the full history of every transaction that ever occurred on the blockchain. With every node as a self-sufficient, independent entity, there is no center of the network. All nodes validate transactions and propagate them to their peers. On top of this, some nodes also participate in the block creation process, and are financially incentivized to do so via receiving a “block reward” in the blockchain’s native cryptocurrency. Unlike our database example, any node in a decentralized network can join or drop at will. No special permission or rights are needed to read or write to the blockchain as long as the protocol is followed.

Without a single, centralized database to connect to, how do nodes find a connection to the network? Many smart contract developers punt on this issue and use a service that hides this complexity behind a web API. That is a tradeoff between coupling your system to a third-party service and the complexity of running your own Ethereum node. If you choose to run your own node, it will need to pull itself up by its bootstraps and connect to the public Ethereum network. Every major Ethereum node software has a hardcoded list of “bootnodes.” These are well-known, relatively reliable IP addresses that can be used to create a sufficiently large pool of network peer connections. If for some reason those nodes aren’t available or are no longer trustworthy, the node software can be given a custom list of bootnodes to use.

In 2019, the global, public Ethereum network had more nodes than any other blockchain network. With over 12,000 nodes, it’s unlikely that any two nodes will have the exact same set of peers. As a transaction enters the network through a single peer, it spreads quickly around the world to every node. Similarly, when a new block is added to the blockchain, news of this addition spreads quickly. Unfortunately, quickly is not the same as instantaneously, meaning that when two blocks are created at the same time by two different nodes, the respective peers of those diverging nodes will be operating on two different versions of the blockchain. This is called a temporary fork. In general, temporary forks are resolved by granting precedence to whichever fork adds the next block, summarized as “longest chain wins.” These rules are determined by a blockchain’s consensus protocol.
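The “longest chain wins” rule can be sketched in a few lines of Python. This is a simplification under stated assumptions: the choose_chain function and the block labels are hypothetical, and chains are compared by block count only, whereas real Proof-of-Work clients compare accumulated work.

```python
from typing import List


def choose_chain(current: List[str], candidate: List[str]) -> List[str]:
    """Keep the current chain unless the candidate is strictly longer."""
    return candidate if len(candidate) > len(current) else current


# Two nodes disagree about block 54 during a temporary fork.
node_a = ["block-52", "block-53", "block-54-from-B"]
node_d = ["block-52", "block-53", "block-54-from-C"]

# At equal length, each node keeps the version it saw first.
print(choose_chain(node_a, node_d) == node_a)

# Once block 55 is built on one side, every node converges on that side.
extended = node_d + ["block-55"]
print(choose_chain(node_a, extended) == extended)
```

The tie case is what makes the fork temporary rather than permanent: neither side gives way until one of them is extended.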

Consensus Protocols

Without a central source of truth, the nodes in a blockchain network need a way to come to consensus on the state of the system. Consensus protocols are how this is accomplished, and are an active area of research. A consensus protocol is simply a system of agreement across a blockchain. There are three protocols that we need to cover, and we’ll start with Proof-of-Work, which both Bitcoin and Ethereum used when they launched their blockchains.

Proof-of-Work

When you hear about “mining ether” or “cryptomining,” this terminology is due to the nature of the Proof-of-Work (PoW) protocol. Like a miner in search of gold, the PoW protocol for creating a block requires considerable effort, and the more effort you expend, the more likely you are to “mine a block.” Also similar to miners in search of gold, PoW “mining” is competitive. Each Ethereum block contains a block reward in Ethereum’s currency, known as “ether.” This reward is granted to the miner who successfully adds a valid block to the blockchain before any other miners. The miner succeeds by successively “hashing” the block data in an attempt to find a cryptographic hash that has specific characteristics. The protocol determines the rarity of these “hashes.” This rarity is called the “difficulty,” the number that determines the relative effort that miners have to expend to “mine a block.”
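The hashing loop described above can be imitated with a toy miner. This is a sketch only: it uses SHA-256 and a simple numeric target as stand-ins, not Ethereum's actual mining algorithm, and the mine function and its difficulty_bits parameter are names invented for this example. Each extra bit of difficulty halves the share of hashes that qualify, which doubles the expected number of attempts.

```python
import hashlib
from typing import Optional, Tuple


def mine(block_data: str, difficulty_bits: int, max_nonce: int = 2**32) -> Optional[Tuple[int, str]]:
    """Try nonces until the hash, read as a number, falls below the target."""
    target = 2 ** (256 - difficulty_bits)  # smaller target means rarer valid hashes
    for nonce in range(max_nonce):
        digest = hashlib.sha256(f"{block_data}{nonce}".encode()).digest()
        if int.from_bytes(digest, "big") < target:
            return nonce, digest.hex()
    return None  # gave up without finding a valid hash


result = mine("block 54: alice pays bob 5", difficulty_bits=12)
if result is not None:
    nonce, block_hash = result
    print(f"found nonce {nonce} with hash {block_hash}")
```

There is no shortcut to finding the nonce other than trying candidates, but anyone can verify a claimed nonce with a single hash. That asymmetry is what lets the rest of the network check a miner's work cheaply.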

Proof-of-Work uses a difficulty variable to maintain a stable block time as miners come and go, adding to or subtracting from the overall mining power (or hashrate) of the network. For example, if the Ethereum network had a cumulative hashrate of 100 terahashes per second (TH/s), and a large mining pool went offline, lowering the cumulative hashrate to 90 TH/s, blocks would suddenly take, on average, about 11% longer to mine (100/90 ≈ 1.11). Ethereum’s protocol adjusts the difficulty to ensure that the correct block time is targeted. In this case, the protocol would adjust by making finding blocks about 10% easier. This process means that in the event of significant changes in the network’s hashrate, the block times will be affected until the difficulty catches up. Additionally, due to the brute-force nature of the mining process, randomness will be an ever-present source of differences in block times, even if the network hashrate stays constant.
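The arithmetic behind this example can be checked with a short sketch. It assumes a simplified model in which the expected block time equals difficulty divided by hashrate, and a proportional retarget rule; both function names are hypothetical, and Ethereum's real difficulty formula adjusts in small steps rather than all at once.

```python
def expected_block_time(difficulty: float, hashrate: float) -> float:
    """Average seconds per block if one block needs `difficulty` hashes on average."""
    return difficulty / hashrate


def retarget(difficulty: float, observed_block_time: float, target_block_time: float) -> float:
    """Scale the difficulty so the observed block time moves back to the target."""
    return difficulty * target_block_time / observed_block_time


TARGET = 13.0                    # seconds, an assumed target block time
difficulty = TARGET * 100e12     # tuned for a network hashing at 100 TH/s

slow = expected_block_time(difficulty, 90e12)   # hashrate drops to 90 TH/s
print(f"block time grows by {slow / TARGET - 1:.1%}")

new_difficulty = retarget(difficulty, slow, TARGET)
print(f"difficulty changes by {new_difficulty / difficulty - 1:.1%}")
```

Losing 10% of the hashrate stretches block times by a factor of 100/90, and cutting the difficulty by 10% brings the block time back to the target.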

Proof-of-Stake

Proof-of-Stake (PoS) has always been on Ethereum’s roadmap. If designed correctly, PoS has significant benefits over PoW. First, there is no need to burn large amounts of electricity to find a valid block. As the respective names suggest, there is no “work” to do, only “stake” to lose. Second, the downside of being a bad actor in a PoS system can be much more severe than in PoW. Block creators stake ether in order to participate, and if they are proven to be operating maliciously, that stake can be taken away (“slashed”). In PoW, malicious operators can be ignored, but we can’t take away their hardware. That asset lives to corrupt another day. Finally, there are fewer economies of scale in PoS. When a PoW miner earns a block reward, they can use that money to achieve non-linear advantages over other miners, such as faster network connectivity. In PoS, while people with more stake will earn more ether, that advantage is linear. The rich still get richer, but not exponentially so. That said, to obtain currency in PoW you can buy hardware and connect to the network, whereas in PoS you must buy from existing currency holders, which puts newcomers at a disadvantage.

For the smart contract developer, the behavior of PoS is similar to PoW. We still need to determine when we consider transaction finality, since block creators (called “proposers” in PoS) can still end up on different forks due to network partitions. Where things will get significantly different is when Ethereum transitions to a sharded blockchain.

Under the project name “Serenity,” Ethereum is introducing Proof-of-Stake, sharding, and several other improvements as a way of resolving Ethereum’s fundamental scaling issues.1 Serenity will also replace the existing EVM with eWASM (Ethereum flavored WebAssembly). None of these initiatives should have a significant effect on the Solidity language. The one significant new complication will be inter-contract communication. Currently, each node has every Ethereum smart contract ever deployed, so it is fairly straightforward for nodes to interact. But once we have a sharded blockchain, contracts will be spread across 1000+ separate blockchains. The protocol for how these contracts will interact is still being developed, but it’s safe to say that smart contract developers will no longer be able to assume that any arbitrary contract will be synchronously available to them. Furthermore, every Ethereum address will need to have a shard ID associated with it in order to know where its state is stored.

Proof-of-Authority

In some situations, it may make sense to use a blockchain while restricting block creation to only certain entities. These situations cut against the typical use case of having it be open to the public. This sort of restricted block creator status is provided by the Proof-of-Authority (PoA) protocol. PoA is commonly used for private blockchains that are hidden within internal networks. The block creators simply take turns appending the next block at the correct frequency. Due to the round-robin nature of PoA, temporary forks are far less likely.
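The turn-taking can be sketched as a simple round-robin schedule. This is an illustration under assumptions: the proposer_for function and the node names are hypothetical, and real PoA implementations add rules for authorities that miss their turn.

```python
from typing import List


def proposer_for(block_number: int, authorities: List[str]) -> str:
    """Return the authority whose turn it is to create the given block."""
    if not authorities:
        raise ValueError("at least one authority is required")
    return authorities[block_number % len(authorities)]


authorities = ["node-a", "node-b", "node-c"]
schedule = [proposer_for(n, authorities) for n in range(5)]
print(schedule)
```

Because every node can compute whose turn it is, two authorities rarely produce competing blocks at the same height, which is why temporary forks are far less likely than under Proof-of-Work.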

There are many more consensus protocols, but they are beyond the scope of this book. This is an active area of research, so expect to see new consensus protocols emerging for years to come.

Regardless of which protocol is used, there is a reward for any node that adds a new block to the blockchain. Block creators receive a block reward as well as the sum of all transaction fees in the block. In Ethereum, transaction fees are referred to as “gas.” We will dig more deeply into the topic of gas later in this chapter.

For the purposes of this book, we don’t need to understand the mechanics of consensus protocols. It’s only important to know the role these protocols play within a blockchain, and understand the differences between the most popular protocols. Smart contract developers can essentially treat consensus protocols like a black box. We need to learn each box’s external behavior, but the internals have no effect on our work.

We will now work to understand how transactions are eventually added to a blockchain.

Transaction Processing

In 2019, a new block of transactions was added to the Ethereum blockchain approximately every 13 seconds, on average. That said, a sample of block times across a 65-hour span (between blocks 8,662,243 and 8,679,755) contained a block that took over 2 minutes to be written to the chain (block 8,674,540) and others that took just 1 second. This wide variation in block times is due to the random nature of Proof-of-Work. As Ethereum transitions to Proof-of-Stake, we expect to see block times and block time variability decrease.

While block times are regulated by the protocol, the actual time that a transaction takes, from the moment it is first broadcast on the network to its execution in a block, can vary from a few seconds to a few hours. This variability is due to the limitations of the current Ethereum protocol combined with its popularity. For example, if Ethereum can process around 25 transactions per second, but there are over 30 transactions entering the network every second, then there will be transactions that will remain unexecuted until network demand slows down to under 25 transactions per second. These pending transactions are held in-memory by each Ethereum node in what is called a “mempool.” The most reliable way to quickly get a transaction out of the mempool and into a block is to pay more fees to the block creator. In Ethereum, these fees are called “gas,” and as a smart contract developer, you’ll have to frequently consider “gas prices.” See more on the topic of gas and gas prices in Figure 1-2.

Mempool and Gas priority
Figure 1-2. In step 1, just 20 transactions/second are broadcast on the network. Assuming the network can currently handle 25 transactions/second, all transactions are mined into the next block. In step 2, 40 transactions/second are broadcast and the network falls behind, creating a mempool of pending transactions. In step 3, few transactions are sent (just 10 per second) and the mempool drains as the network catches up. The drop in transactions in step 3 would likely have been caused by rising gas prices.
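The three steps in the figure can be replayed with a small simulation. It is a deliberately crude model: the simulate_backlog function is hypothetical, time advances in one-second ticks, and the network is assumed to clear exactly 25 transactions per second.

```python
from typing import List


def simulate_backlog(arrivals_per_second: List[int], capacity: int = 25) -> List[int]:
    """Track how many transactions are left pending after each second."""
    backlog = 0
    history = []
    for arrivals in arrivals_per_second:
        # Anything beyond what the network can process waits in the mempool.
        backlog = max(0, backlog + arrivals - capacity)
        history.append(backlog)
    return history


# Two seconds each of step 1 (20/s), step 2 (40/s), and step 3 (10/s).
print(simulate_backlog([20, 20, 40, 40, 10, 10]))
```

The backlog stays at zero while demand is under capacity, grows by 15 per second in step 2, and drains by 15 per second in step 3.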

If the same account has two transactions in the mempool, the block creators know which transaction to choose first based on its “nonce,” an account-specific counter that is incremented with each transaction. This means that while the mempool is generally a heap of transactions that block creators can choose from, it’s actually a heap of queues in the cases where accounts have sent multiple transactions before seeing them succeed. Figure 1-3 shows how each account’s transactions are sequenced by its nonce.

Transaction nonces
Figure 1-3. Alice’s current nonce is 1, Bob’s is 0, and Charlie’s is 2
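The “heap of queues” idea can be sketched as follows. This is not how any particular client implements its mempool: the Tx class, the select_transactions function, and the greedy highest-gas-price-first rule are assumptions made for illustration. The sketch reuses the current nonces from Figure 1-3.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Tx:
    sender: str
    nonce: int
    gas_price: int  # in Gwei


def select_transactions(mempool: List[Tx], next_nonce: Dict[str, int], max_count: int) -> List[Tx]:
    """Pick the best-paying transactions while respecting each sender's nonce order."""
    queues: Dict[str, List[Tx]] = {}
    for tx in sorted(mempool, key=lambda t: t.nonce):
        queues.setdefault(tx.sender, []).append(tx)  # one nonce-ordered queue per account

    expected = dict(next_nonce)
    chosen: List[Tx] = []
    while len(chosen) < max_count:
        # Only the head of each queue is eligible, and only if its nonce is next in line.
        ready = [q[0] for s, q in queues.items() if q and q[0].nonce == expected.get(s, 0)]
        if not ready:
            break
        best = max(ready, key=lambda t: t.gas_price)
        chosen.append(best)
        queues[best.sender].pop(0)
        expected[best.sender] = best.nonce + 1
    return chosen


mempool = [Tx("alice", 2, 50), Tx("alice", 1, 1), Tx("bob", 0, 10), Tx("charlie", 3, 99)]
current = {"alice": 1, "bob": 0, "charlie": 2}
for tx in select_transactions(mempool, current, max_count=10):
    print(tx)
```

Alice's generous second transaction has to wait behind her cheap first one, and Charlie's transaction is never picked because the transaction with his current nonce is missing from the mempool.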

Regardless of whether a transaction is delayed by the nature of the protocol or due to network congestion, as a developer, you need to consider the unpredictable and asynchronous nature of your users’ experiences with your software system. In a world where financial, commerce, and social media software have been tuned to accept transactions in less than a second, systems backed by smart contracts have significant user experience challenges. The next section digs a level deeper into these challenges.

Transaction Finality

When a transaction executes successfully in a typical database, developers assume that its effects will never be rolled back. In the case of blockchains, though, it’s possible that a user could see a successful transaction included in a block, only to see that block promptly orphaned and replaced with a different block. The replacement block may not include the previously successful transaction! This lack of finality with blockchain transactions is due to the decentralized nature of the network. It’s possible for two different nodes to create blocks nearly simultaneously, and for their respective peers to be on different “forks” of the blockchain, as mentioned earlier. Whichever chain creates the next block first will beat the other fork, and previously successful transactions can disappear in the process. From the user’s standpoint, it appears that their once successful transaction gets reverted. Figures 1-4 through 1-6 illustrate the issue.

Temporary Fork stage 1
Figure 1-4. Illustrating a temporary fork, step 1: all four nodes have the same block 53 and consensus with each other.

Temporary Fork stage 2
Figure 1-5. Illustrating a temporary fork, step 2: nodes B and C have forked from each other due to mining (or receiving) two different block 54s simultaneously. Node D has received Node C’s block 54, while Node A has received Node B’s block 54. There are now two divergent blockchains.

Temporary Fork from a transaction's perspective
Figure 1-6. Illustrating a temporary fork, step 3: note that the competing blocks of a temporary fork can contain different transactions. Once this fork is resolved, either TXw or TXz will no longer exist on the blockchain.

This problem can be solved by waiting for additional blocks to be added to the chain before considering the transaction settled. Each application will need to have its own threshold for considering a transaction final. Popular cryptocurrency exchanges will wait for up to 50 blocks for a transaction to be treated as “confirmed” on the Ethereum network. When the stakes are lower, it’s less necessary to wait that long. Each application should consider how long it wants to wait to consider transactions final. In some cases, this “confirmation” will be built into the user experience, and in others, it may just be a general warning that not all transactions are final.
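A confirmation threshold might be expressed like this. The function names are hypothetical, and the sketch assumes the common convention that a transaction in the latest block has one confirmation. A real application would also re-check that the transaction's block is still part of the chain its node currently follows.

```python
def confirmations(tx_block_number: int, latest_block_number: int) -> int:
    """Count the block containing the transaction plus every block built on top of it."""
    if latest_block_number < tx_block_number:
        return 0
    return latest_block_number - tx_block_number + 1


def is_final(tx_block_number: int, latest_block_number: int, required: int) -> bool:
    """Treat the transaction as settled once enough blocks have piled up."""
    return confirmations(tx_block_number, latest_block_number) >= required


# A cautious exchange might require 50 confirmations; a low-stakes app far fewer.
print(is_final(8_674_540, 8_674_589, required=50))
print(is_final(8_674_540, 8_674_545, required=3))
```

The required value is the application-specific threshold discussed above: higher stakes justify a longer wait.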

Now that we’ve covered temporary forks, let’s talk about their more permanent cousins.

Hard Forks

Each node in a blockchain network must run software that is compatible with the protocol in order to participate in the propagation of blocks and transactions as well as the creation of new blocks. The community of blockchain node developers, protocol developers, block creators, exchanges, and application developers all weigh in on how the protocol should best evolve to suit the needs of the community. Ultimately, this evolution happens through developers changing the blockchain node software, and the people who run the node software (what we call “node runners”) choosing whether to use the new version or not. Sometimes, the changes made to the protocol are so significant that the new version is not compatible with previous versions of the software. This situation is called a “hard fork” because a new blockchain will spawn from the old, creating a fork in the road for node runners, who will have to decide for themselves which fork they will support.

Hard forks can be contentious or non-contentious. Non-contentious hard forks happen when the entire community upgrades, effectively abandoning the old fork for the new one. This is typical when there is a serious defect discovered in the protocol. Contentious hard forks happen when a critical mass of node runners decide to continue using the existing fork and to evolve the old protocol separately from the new version. Ethereum (ETH) had a contentious hard fork in 2016 that resulted in the previous blockchain surviving and calling itself Ethereum Classic (ETC). There could be additional contentious hard forks as the community implements the various components of the Serenity release.

We’ve been focused mainly on general aspects of blockchains; now, let’s focus in on Ethereum specifically.

Ethereum Fundamentals

We have been describing relatively general characteristics of blockchains, with some Ethereum-specific examples. Now we dig into the fundamentals of Ethereum, and discuss how these pieces fit together to enable smart contract development. It is important to understand transaction costs, how accounts and contracts are identified, and how transactions are executed within blocks. These Ethereum fundamentals form the foundation for understanding how to design and develop decentralized applications.

Ether and Gas

The Ethereum protocol has its own currency, called ether. The fundamental use of this currency is to pay block creators to include transactions in blocks. Much like the US dollar, ether is divisible, though to much smaller fractions than a cent. The smallest unit of ether is called a wei, which is a quintillionth of an ether (for perspective, a quintillion is a billion billions, or 10^18). Due to a wei’s small size in proportion to ether, you’ll often see ether denominated in Gwei, particularly when it comes to gas prices. A Gwei is a billion wei, and a billion Gwei is one ether.
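These unit relationships are easy to get wrong by a few zeros, so here is a small conversion sketch. The constant and function names are our own. Amounts are kept as integers of wei and only converted to a decimal for display, which avoids floating-point rounding.

```python
from decimal import Decimal

WEI_PER_GWEI = 10**9
WEI_PER_ETHER = 10**18


def gwei_to_wei(gwei: int) -> int:
    """Convert a whole number of Gwei to wei."""
    return gwei * WEI_PER_GWEI


def wei_to_ether(wei: int) -> Decimal:
    """Express an amount of wei in ether without floating-point error."""
    return Decimal(wei) / Decimal(WEI_PER_ETHER)


print(gwei_to_wei(1))                # one Gwei in wei
print(wei_to_ether(gwei_to_wei(1)))  # one Gwei in ether
print(wei_to_ether(WEI_PER_ETHER))   # a quintillion wei in ether
```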

While there is a single Ethereum protocol, there is more than one network running that protocol. It’s possible to create private networks running Ethereum, similar to how a private “internet” is called an “intranet.” These private networks still deal in ether just like the public Ethereum network, but their ether isn’t worth anything on the open market. The ether on the public Ethereum network is known as ETH and it has real-world value. The public Ethereum network is referred to as “mainnet” by developers. There are also public test networks or “testnets” that the community uses as staging environments. These test networks typically have “faucets,” or mechanisms to give developers free ether in order to test their smart contracts. Every network running the Ethereum protocol has ether, but mainnet’s ETH is the one that has actual value. To deploy and execute smart contracts on mainnet, we need an account that holds ETH. To acquire ETH you will need to purchase it via an exchange, receive it from a friend, earn it from a business, or mine it yourself.

Smart contract code, such as Solidity, is compiled to bytecode, which provides a series of opcodes to the EVM. An opcode is an instruction such as PUSH1 or MLOAD that is interpreted by the EVM. Each of these opcodes has an associated “gas cost.”2 Let’s take a look at a simple smart contract, and inspect its bytecode and opcodes. Thankfully, as smart contract developers, we don’t need to understand the contents of the bytecode or opcodes since we can understand the Solidity easily enough. That said, as you progress as a smart contract developer, digging into these lower-level concepts will give you a deeper understanding of what’s possible.

pragma solidity ^0.4.25;

contract Incrementer {
    uint256 public count;

    function addOne() public {
        count++;
    }
}

Compiling this contract using solc results in the following bytecode:

6060604052341561000f57600080fd5b60cb8061001d6000396000f300606060405260043610604
9576000357c0100000000000000000000000000000000000000000000000000000000900463ffff
ffff168063a7916fac14604e578063febb0f7e146060575b600080fd5b3415605857600080fd5b6
05e6086565b005b3415606a57600080fd5b60706099565b60405180828152602001915050604051
80910390f35b6000808154809291906001019190505550565b600054815600a165627a7a7230582
08e9afbffafd387e67b7c38d8239aaa70fde96a805cebfb6f30517dd68e8664be0029

And the opcodes as reported by solc look like this:

PUSH1 0x60 PUSH1 0x40 MSTORE CALLVALUE ISZERO PUSH2 0xF JUMPI PUSH1 0x0 DUP1
REVERT JUMPDEST PUSH1 0xCB DUP1 PUSH2 0x1D PUSH1 0x0 CODECOPY PUSH1 0x0 RETURN
STOP PUSH1 0x60 PUSH1 0x40 MSTORE PUSH1 0x4 CALLDATASIZE LT PUSH1 0x49 JUMPI
PUSH1 0x0 CALLDATALOAD PUSH29 0x100000000000000000000000000000000000000000000000
000000000 SWAP1 DIV PUSH4 0xFFFFFFFF AND DUP1 PUSH4 0xA7916FAC EQ PUSH1 0x4E
JUMPI DUP1 PUSH4 0xFEBB0F7E EQ PUSH1 0x60 JUMPI JUMPDEST PUSH1 0x0 DUP1 REVERT
JUMPDEST CALLVALUE ISZERO PUSH1 0x58 JUMPI PUSH1 0x0 DUP1 REVERT JUMPDEST PUSH1
0x5E PUSH1 0x86 JUMP JUMPDEST STOP JUMPDEST CALLVALUE ISZERO PUSH1 0x6A JUMPI
PUSH1 0x0 DUP1 REVERT JUMPDEST PUSH1 0x70 PUSH1 0x99 JUMP JUMPDEST PUSH1 0x40
MLOAD DUP1 DUP3 DUP2 MSTORE PUSH1 0x20 ADD SWAP2 POP POP PUSH1 0x40 MLOAD DUP1
SWAP2 SUB SWAP1 RETURN JUMPDEST PUSH1 0x0 DUP1 DUP2 SLOAD DUP1 SWAP3 SWAP2 SWAP1
PUSH1 0x1 ADD SWAP2 SWAP1 POP SSTORE POP JUMP JUMPDEST PUSH1 0x0 SLOAD DUP2 JUMP
STOP LOG1 PUSH6 0x627A7A723058 KECCAK256 DUP15 SWAP11 CREATE2 SELFDESTRUCT 0xaf
0xd3 DUP8 0xe6 PUSH28 0x7C38D8239AAA70FDE96A805CEBFB6F30517DD68E8664BE0029000000

The concept of “gas” exists to decouple the price of ether from Ethereum’s transaction fees. Without this decoupling, Ethereum transaction fees would be pegged to the price of ether, which would have negative impacts on the ecosystem due to the volatility of ETH’s value. To decouple the exchange rate between gas and ether, every Ethereum transaction sets its own gasprice to determine how many wei a single unit of gas costs. As block creators decide which transactions to include in a block, they are incentivized to include the transactions that will give them the most generous gasprice for their computations. At any given moment there is an implicitly understood market rate for gas. Paying below market rate will mean waiting longer than most other transactions for your transaction to be executed, while paying above market rate will allow you to leave the mempool sooner than most.

Ethereum transaction creators need to consider the costs versus benefits of setting their gas price above or below the current market rate. Some transactions are not time-sensitive and can wait many hours to be executed. In this case, it makes sense to pay relatively little ether for a transaction’s execution. Block creators finally include low-gas-price transactions when the mempool is shallow and there isn’t a critical mass of transactions with a generous gasprice. Inversely, some transactions are incredibly time-sensitive, and are so valuable that it makes sense to pay significantly above the market rate for fast block inclusion. There are even people who run software to observe the mempool in real time and try to “front-run” certain financial trading transactions by seeing a pending trade and jumping ahead of it by setting a higher gasprice.

Each Ethereum transaction has to include gas and gasprice attributes, which when multiplied will set the maximum transaction fee for the transaction, denominated in wei. This gas attribute sets a limit to how many computations the transaction can perform. If that limit is reached, the smart contract execution reverts, but the transaction is still written to the blockchain, and the fees are consumed by the block creator. If the smart contract call completes with leftover gas, the gas is returned to the transaction creator. The sum of the gas used across all transactions in a block cannot exceed the block’s specified gaslimit. This also means that no single transaction’s gas usage can exceed the block’s gaslimit.
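The fee rules in this paragraph can be worked through numerically. The sketch below is a simplified model with hypothetical function names. It uses 21,000 gas, the cost of a plain ether transfer, as the amount of gas needed, and an assumed gas price of 2 Gwei.

```python
from typing import Tuple


def max_fee_wei(gas: int, gas_price_wei: int) -> int:
    """The most a transaction can cost: its gas limit times its gas price."""
    return gas * gas_price_wei


def settle(gas: int, gas_price_wei: int, gas_needed: int) -> Tuple[bool, int, int]:
    """Return (succeeded, fee paid in wei, refund in wei) for a transaction."""
    if gas_needed > gas:
        # Out of gas: execution reverts, but the whole limit is still paid for.
        return False, gas * gas_price_wei, 0
    unused = gas - gas_needed
    return True, gas_needed * gas_price_wei, unused * gas_price_wei


GAS_PRICE = 2 * 10**9  # 2 Gwei, expressed in wei

print(max_fee_wei(30_000, GAS_PRICE))
print(settle(30_000, GAS_PRICE, gas_needed=21_000))
print(settle(20_000, GAS_PRICE, gas_needed=21_000))
```

In the second call the sender sets a limit of 30,000 gas, pays for the 21,000 used, and gets the rest back. In the third call the limit is too low, so the execution fails and the fee for the full 20,000 gas is still paid.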

The next section will dig into the details of the players within a transaction and how they are identified.

Accounts

The most basic Ethereum transactions are just externally owned accounts (EOAs) sending ether to each other. Additionally, Ethereum transactions can be sent from an EOA to a smart contract. Both EOAs and smart contracts are identified by an Ethereum address like the following:

0x52bc44d5378309ee2abf1539bf71de1b7d7be3b5

An address is represented by a 20-byte hexadecimal number, written as 40 hexadecimal digits after the 0x prefix. Given the magnitude of the numbers involved in generating an address, it is practically impossible to generate two identical addresses. There is no way to distinguish an address for a smart contract from an EOA without inspecting the blockchain. While EOAs and contracts have indistinguishable addresses, they have some important differences.
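A basic format check for an address like the one above can be written with a regular expression. The looks_like_address helper is hypothetical and only checks the shape of the string: it does not verify the optional mixed-case checksum, and it cannot tell whether the address belongs to an EOA or a contract, which requires inspecting the blockchain.

```python
import re

# "0x" followed by exactly 40 hexadecimal digits (20 bytes).
ADDRESS_PATTERN = re.compile(r"0x[0-9a-fA-F]{40}")


def looks_like_address(value: str) -> bool:
    """Return True if the string has the shape of an Ethereum address."""
    return ADDRESS_PATTERN.fullmatch(value) is not None


print(looks_like_address("0x52bc44d5378309ee2abf1539bf71de1b7d7be3b5"))
print(looks_like_address("0x52bc44d5"))
```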

Every transaction in the Ethereum blockchain is initiated by an EOA. Smart contracts can’t spontaneously perform an action. They can call other smart contracts, but every transaction originates from an EOA. When contracts are called, they can emit events, store data, receive ether, send ether to EOAs, or send data or ether to other contracts. EOAs, on the other hand, can initiate transactions and receive ether, but nothing more. They cannot react to transactions that involve them the way that a smart contract does.

Contracts

Contracts in Solidity are organized in an object-oriented style similar to the Java programming language. In object-oriented parlance, a contract is really a class, or a collection of state variables and functions. In order to reuse common functionality, object-oriented languages allow a class (or contract, in this case) to inherit from another. Since Solidity uses function as a keyword, we will refer to as “functions” what most object-oriented languages refer to as “methods.” Solidity functions can be separated into two distinct types: write-only and read-only.

Read-only Solidity functions are denoted with the pure and view keywords. These functions can take input data, operate on that data, and return data; view functions can also read contract data, while pure functions cannot. Read-only functions cannot change the state of the contract or emit events. Because no updates are needed on-chain, a read-only function called from outside the blockchain returns almost immediately, much like a web API call, particularly a GET request. It’s important to note that being able to skip on-chain updates means that such calls cost no gas, and no transaction is created. When a read-only function is called from within a transaction, however, its execution still consumes gas like any other computation.

Write-only functions are the default in Solidity, so they require no additional keywords. Despite being “write-only,” they can actually return data, but due to the asynchronous nature of Ethereum, the return data is practically useless to the sender of the transaction, hence “write-only.” These functions are the workhorses of Ethereum, and their data must be sent via transaction and included in a block in order for the function to be executed. An unsuccessful write-only function will revert, either because it has run out of gas or reached an invalid EVM state, or because of an explicit statement in the contract, such as a failing require statement. A successful write-only function doesn’t actually have to change anything, but it typically changes something and often emits one or more events in the process.
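To make the idea of reverting concrete, here is a minimal JavaScript model, not actual EVM behavior: a function runs against a draft copy of the state, and the draft is kept only if the function finishes without failing. The names execute, requireThat, and withdraw are hypothetical, and requireThat merely stands in for Solidity’s require statement.

```javascript
// Runs `fn` against a copy of `state`; the copy is kept only if `fn` succeeds.
function execute(state, fn) {
    const draft = { ...state };
    try {
        fn(draft);
        return { success: true, state: draft };
    } catch (error) {
        // On failure the original, untouched state is returned.
        return { success: false, state, reason: error.message };
    }
}

// Hypothetical stand-in for Solidity's require statement.
function requireThat(condition, message) {
    if (!condition) {
        throw new Error(message);
    }
}

function withdraw(amount) {
    return (draft) => {
        draft.balance -= amount; // the state is modified first...
        requireThat(draft.balance >= 0, "insufficient balance"); // ...then checked
    };
}

const before = { balance: 10 };
const ok = execute(before, withdraw(4));
const failed = execute(before, withdraw(40));

console.log(ok.state.balance);     // 6
console.log(failed.state.balance); // 10
```

The second call modifies the draft before the check fails, yet the balance it reports is unchanged, which is the essence of a revert.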

The purpose of events in Ethereum is generally twofold: to provide a custom historical log of what has occurred in the contract, and to allow observers to subscribe to real-time updates. Due to the nature of the blockchain, we already have a historical ledger of everything that has ever happened, but events are a convenient way to provide more domain-specific logging and updates so that users don’t have to create their own interpretations of transactions and state transitions.
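The two roles of events can be pictured with a toy JavaScript class. This is an analogy only, with a made-up EventLog class and made-up event names; it does not reflect how Ethereum nodes store or deliver logs.

```javascript
// A toy stand-in for a contract's event log; not how Ethereum stores logs internally.
class EventLog {
    constructor() {
        this.history = [];     // every event ever emitted, in order
        this.subscribers = []; // callbacks waiting for new events
    }

    subscribe(callback) {
        this.subscribers.push(callback);
    }

    emit(name, args) {
        const entry = { name, args };
        this.history.push(entry);
        this.subscribers.forEach((callback) => callback(entry));
    }
}

const log = new EventLog();
log.emit("Deposit", { from: "alice", amount: 5 }); // emitted before anyone subscribed

const seenLive = [];
log.subscribe((entry) => seenLive.push(entry.name));
log.emit("Withdrawal", { to: "alice", amount: 2 });

console.log(log.history.length); // 2
console.log(seenLive);           // [ 'Withdrawal' ]
```

The subscriber only sees the event emitted after it subscribed, while the history still holds both.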

To better understand the mechanics of Ethereum’s smart contracts, let’s now focus on the vehicles of their modification.

Blocks and Transactions

Only Ethereum block creators determine the attributes of a block, such as which transactions are included. Similarly, only Ethereum users determine the attributes of a transaction, such as which contract to send data to. As smart contract developers, we frequently need to be aware of the state of the block as well as the state of the transaction that is currently executing.

Solidity exposes the following transaction (tx) attributes:

gasprice

The gas price (in wei) set by the EOA that created the transaction.

origin

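The address of the EOA that created the transaction. Surprisingly, this address is rarely useful, and relying on it for authorization is often insecure, because any intermediary contract that the EOA calls can then act with the EOA as its origin.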

A transaction can involve an arbitrary number of contracts in its execution, provided that the execution fits within the constraints of the block’s gaslimit. Solidity exposes a number of other transaction-related attributes, but groups them into a message (msg) abstraction. Messages refer to the communication between contracts and anything that can call them, whether EOAs or other contracts. For example, a contract function call will always have a msg.sender. That msg.sender could be equal to tx.origin (the creator of the transaction), or it could be the address of an intermediary contract.
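The difference between msg.sender and tx.origin is easiest to see in a chain of calls. The following JavaScript sketch is a simplified model with hypothetical names (traceCalls, Router, Vault); it only tracks who called whom.

```javascript
// Walks a call chain and reports what each callee would see.
// callChain[0] is the EOA that created the transaction; the rest are contracts, in call order.
function traceCalls(callChain) {
    const origin = callChain[0];
    const frames = [];
    for (let i = 1; i < callChain.length; i++) {
        frames.push({
            callee: callChain[i],
            msgSender: callChain[i - 1], // the immediate caller
            txOrigin: origin,            // always the EOA that started the transaction
        });
    }
    return frames;
}

// An EOA calls a Router contract, which in turn calls a Vault contract.
const frames = traceCalls(["alice (EOA)", "Router", "Vault"]);

console.log(frames[0].msgSender, "|", frames[0].txOrigin); // alice (EOA) | alice (EOA)
console.log(frames[1].msgSender, "|", frames[1].txOrigin); // Router | alice (EOA)
```

In the first call the two values coincide. In the second, the Vault sees the Router as msg.sender while tx.origin still points at the EOA.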

Solidity’s message (msg) attributes are as follows:

data

The raw bytes of data that were sent to the currently executed external or public function. This is also referred to as calldata.

sender

The address of the caller of the currently executed external or public function.

sig

The first four bytes of the calldata determine which function is being called. This is also referred to as the function identifier.

value

The amount of wei that is being sent to this function.
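To illustrate how data and sig relate, the sketch below splits a calldata string into its four-byte function identifier and the 32-byte words that follow. The helper splitCalldata is hypothetical, and the sample calldata is constructed by hand: it assumes a function with the signature transfer(address,uint256), whose identifier is 0xa9059cbb, called with the address from the Accounts section and the number 5.

```javascript
// Splits calldata (a 0x-prefixed hex string) into the 4-byte function identifier
// and the 32-byte words that follow it.
function splitCalldata(calldata) {
    const hex = calldata.startsWith("0x") ? calldata.slice(2) : calldata;
    if (hex.length < 8) {
        throw new RangeError("calldata is shorter than four bytes");
    }
    const sig = "0x" + hex.slice(0, 8);
    const words = [];
    for (let i = 8; i < hex.length; i += 64) {
        words.push("0x" + hex.slice(i, i + 64));
    }
    return { sig, words };
}

// Hand-built sample: identifier, then each argument left-padded to 32 bytes (64 hex characters).
const recipient = "52bc44d5378309ee2abf1539bf71de1b7d7be3b5";
const calldata =
    "0xa9059cbb" +
    recipient.padStart(64, "0") +
    (5).toString(16).padStart(64, "0");

const { sig, words } = splitCalldata(calldata);
console.log(sig);              // 0xa9059cbb
console.log(words.length);     // 2
console.log(BigInt(words[1])); // 5n
```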

Solidity exposes the following block attributes:

number

Each block increments this number. The genesis block was block 0.

timestamp

The time in seconds since epoch that the block was created. You may also see code that uses its alias, now.

blockhash

In addition to its sequential block number, each block is uniquely identified by its hash. A hash is a hexadecimal number such as 0x88e96d4537bea4d9c05d12549907b32561d3bf31f45aae734cdc119f13406cb6. The hash of the current block is not available, but by providing a block number, you can get the hash of any of the previous 256 blocks. Unlike the other entries in this list, blockhash is called as a function, blockhash(blockNumber), rather than read as an attribute of block.

difficulty

The mining difficulty level of the current block.

gaslimit

The maximum amount of gas that can be consumed by this block. This is set by the block creator.

coinbase

The address of the block creator.
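The availability rule for blockhash described above can be written as a simple range check. The function name isBlockhashAvailable is our own, and the block numbers are arbitrary.

```javascript
// Mirrors the rule described for blockhash: only the 256 blocks before the
// current one are available, and the current block's own hash is not.
function isBlockhashAvailable(currentBlock, requestedBlock) {
    return requestedBlock < currentBlock && requestedBlock >= currentBlock - 256;
}

console.log(isBlockhashAvailable(1000, 999));  // true
console.log(isBlockhashAvailable(1000, 1000)); // false (the current block)
console.log(isBlockhashAvailable(1000, 744));  // true (the oldest available block)
console.log(isBlockhashAvailable(1000, 743));  // false (too old)
```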

Next, we will address the quirky aspect of how time works in a smart contract.

What Time Is It?

Most programming languages allow developers to check the time, which typically uses the time reported by the computer the program is running on. For instance, this JavaScript program running on Node.js reports on the current time as it is executing:

for (let i = 0; i < 10; i++) {
    console.log(new Date());
}

The output of this program would be something like this:

2019-10-05T05:08:45.058Z
2019-10-05T05:08:45.059Z
2019-10-05T05:08:45.060Z
2019-10-05T05:08:45.060Z
2019-10-05T05:08:45.060Z
2019-10-05T05:08:45.060Z
2019-10-05T05:08:45.060Z
2019-10-05T05:08:45.060Z
2019-10-05T05:08:45.060Z
2019-10-05T05:08:45.060Z

Not surprisingly, you can see the time changing as the program executes. Here is the equivalent code in Solidity:

pragma solidity ^0.5.0;

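contract TimeReporter {
    // Logged once per loop iteration so the reported times can be compared.
    event TimeLog(uint256 time);

    function reportTime() public {
        for (uint8 i = 0; i < 10; i++) {
            // block.timestamp is fixed for the whole block.
            emit TimeLog(block.timestamp);
        }
    }
}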

The events emitted by this function would all have exactly the same time attribute. That time attribute, set by block.timestamp, is the time that the block was created. For every transaction in that block, the block.timestamp attribute will be identical. While the clock on a computer ticks at least once a millisecond, the “clock” on a blockchain only ticks as often as blocks are added to the chain. Because of this low-resolution clock, we can never count on a block being created at one exact second. When you write code that checks the time, comparisons should always use greater than or less than, rather than exactly equal.
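The following JavaScript sketch shows why an equality check against a deadline is fragile. The block timestamps and the deadline are invented for illustration.

```javascript
// Hypothetical block timestamps (seconds since epoch), 13 to 15 seconds apart.
const blockTimestamps = [1570252110, 1570252125, 1570252138, 1570252151];
const deadline = 1570252130; // falls between two blocks

// Fragile: no block in this list was created at exactly the deadline second.
const exactMatch = blockTimestamps.some((timestamp) => timestamp === deadline);

// Robust: the first block at or after the deadline satisfies the check.
const firstBlockPastDeadline = blockTimestamps.find((timestamp) => timestamp >= deadline);

console.log(exactMatch);             // false
console.log(firstBlockPastDeadline); // 1570252138
```

A contract that waited for the timestamp to equal the deadline would never fire here, while one that checks for greater than or equal fires on the third block.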

When designing smart contracts, it is also important to keep in mind that block creators can manipulate the time a block is created, as well as the ordering of its transactions, to their advantage. For instance, if a smart contract has a built-in deadline, and it would significantly benefit most block creators to have that deadline missed, then they can choose to delay any transactions that would meet the deadline. Manipulating the ordering of transactions can take the form of “front-running” token trades. For example, someone sends a transaction to buy 5 ABC tokens for 1 ETH, due to some new information about ABC tokens. A block creator with significant hash power could delay that 5 ABC / 1 ETH transaction, add their own transaction to a block for a lesser amount, and grab those ABC tokens for themselves. This is only possible if the block creator actually succeeds at creating a block, which is highly competitive. So these concerns aren’t a significant vulnerability of smart contracts, but they are important to keep in mind as you consider your designs and incentives.

Finally, let’s consider some of the “crypto” in cryptocurrency.

Signing Transactions

To trust a signed transaction, we need to understand why its signature could only have been created by a specific private key. This comes down to good old public-key cryptography. A private key is the secret counterpart to a public key. The address of an EOA is a truncation of the hash of its public key.3

Mercifully, in the course of developing smart contracts and decentralized applications (DApps), we don’t often work directly with private keys. We strongly recommend leaving private key management and cryptographic signatures to wallet software such as MetaMask. Knowing an EOA’s private key is synonymous with owning that account because the private key is what is used to sign transactions. Without this cryptographic signature, there is no way to authenticate whether a transaction was actually sent by its specified EOA.

When we send an Ethereum transaction using any of the Web3 libraries, the cryptographic signature happens in the background. The following transaction attributes are concatenated, encoded, and then signed with the configured private key:

nonce

Sequence number of this transaction for this EOA.

gasPrice

Amount of wei that this transaction is paying per unit of gas.

gas

Amount of gas that this transaction is willing to spend.

to

Recipient address of this transaction. It could be an EOA or a contract.

value

Amount of wei (if any) that this transaction is sending to the recipient.

data

In the case of a contract call, this contains the function identifier and the encoded parameters. In the case of a contract deployment, this contains the contract bytecode. If no contracts are involved, this is generally blank.

chainId

Each public Ethereum network has a chainId. Mainnet is 1, the Kovan testnet is 42, etc.

Once those attributes are signed, the signature itself is included in the transaction so that Ethereum nodes can validate that the sender is legitimate. To do so, nodes recover the signer’s public key from the signature, derive the corresponding address, and treat that address as the sender. If someone were to try to send a transaction with a bad signature, the nodes would reject it.
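The general sign-then-verify idea can be tried out with the crypto module that ships with Node.js. This is a simplified sketch, not Ethereum’s actual scheme: it uses secp256k1, the curve Ethereum uses, but it hashes with SHA-256 instead of Keccak-256, encodes the attributes as JSON rather than Ethereum’s binary encoding, and verifies against a known public key instead of recovering the signer from the signature. The attribute values are made up.

```javascript
const crypto = require("crypto");

// Generate a throwaway key pair on secp256k1, the curve Ethereum uses.
const { privateKey, publicKey } = crypto.generateKeyPairSync("ec", {
    namedCurve: "secp256k1",
});

// A stand-in for the encoded transaction attributes (JSON, not Ethereum's real encoding).
const payload = Buffer.from(JSON.stringify({
    nonce: 0,
    gasPrice: "2000000000",
    gas: "21000",
    to: "0x52bc44d5378309ee2abf1539bf71de1b7d7be3b5",
    value: "1000000000000000000",
    data: "0x",
    chainId: 1,
}));

// Sign with the private key; SHA-256 is used here only because it ships with Node.js.
const signature = crypto.sign("sha256", payload, privateKey);

// Anyone holding the public key can check the signature...
const valid = crypto.verify("sha256", payload, publicKey, signature);

// ...and any change to the signed attributes invalidates it.
const tampered = Buffer.from(payload.toString().replace('"nonce":0', '"nonce":1'));
const stillValid = crypto.verify("sha256", tampered, publicKey, signature);

console.log(valid);      // true
console.log(stillValid); // false
```

Changing even the nonce makes the signature check fail, which is why a node can reject a transaction whose attributes were altered after signing.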

Summary

We have touched lightly on many aspects of blockchains and Ethereum in this chapter. We hope that we have whetted your appetite and that you are eager to dig deeper into decentralized application development via our next chapter. The rest of the chapters in Part I of the book are increasingly pragmatic preparations for Part II, where we will begin developing smart contracts and DApps in earnest.

1 Serenity is still under active development at the time of this writing.

2 A mapping of opcodes to gas costs can be found in Appendix G of the Ethereum yellow paper at https://ethereum.github.io/yellowpaper/paper.pdf.

3 For a deep dive into cryptography, read Applied Cryptography by Bruce Schneier (John Wiley & Sons).
