How Ethereum Works Part 2: Smart Contracts, Gas, and Dapps

Why anything that can be programmed can be programmed on Ethereum.

By Mike Goldin

This article is the sequel to How Ethereum Works Part 1, which focuses on understanding Bitcoin. All the background you need to begin understanding Ethereum is there.

Remember the first time you began using objects in a programming language? Or made your first attempt at functional programming? Remember how badly it spun your head around on a conceptual level before becoming intuitive? Developing blockchain orientation is much like developing object and functional orientations: initially disorienting, but eventually obvious. In Part 1 of this series we learned how blockchains work in general by figuring out the Bitcoin blockchain. In this article, we’ll begin reasoning about the Ethereum blockchain to start developing your blockchain orientation. Nurturing an intuition for how to structure blockchain interactions will pay dividends for a long time to come!

To get started, read the three sections in the Ethereum white paper on accounts, transactions, and messages. Read the rest as well if you like, but if you read “Just Enough Bitcoin For Ethereum,” then you already understand the basic technical underpinnings. Just like when you read the Bitcoin white paper, don’t sweat it if something doesn’t make sense on your first read-through. We’ll get there.

Now You’re Thinking With Contracts

Smart contracts are code that is stored and executed on a blockchain. Add a user interface and smart contracts serve as the backends for decentralized applications, or dapps. With your understanding of the Bitcoin blockchain, it might be useful to think of a Bitcoin transaction as a simple program with three inputs and two outputs (this abuses Bitcoin’s actual native notions of what inputs and outputs are, but don’t sweat it for now). The inputs are a sum of Bitcoin to transfer, an address to transfer from and an address to transfer to. The outputs are the previously specified accounts, each with a new balance denoting the transfer. A mined transaction is a public record that this simple program was executed with some given inputs and produced a set of outputs. In Bitcoin’s case the transfer program is the only one which exists, so every node knows how to verify that the outputs make sense given the inputs.
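
To make the "transaction as a program" framing concrete, here is a minimal Python sketch. It follows the simplified three-inputs, two-outputs picture above rather than Bitcoin’s real transaction format, and the names `transfer`, `alice`, and `bob` are made up for illustration.

```python
def transfer(balances, sender, recipient, amount):
    """Toy transfer program: three inputs in, two updated balances out."""
    if sender == recipient:
        raise ValueError("sender and recipient must differ")
    if amount <= 0:
        raise ValueError("amount must be positive")
    if balances.get(sender, 0) < amount:
        raise ValueError("insufficient funds")
    # The two outputs: the same two accounts, each with a new balance.
    return {
        sender: balances[sender] - amount,
        recipient: balances.get(recipient, 0) + amount,
    }


balances = {"alice": 10, "bob": 3}
outputs = transfer(balances, "alice", "bob", 4)
print(outputs)  # {'alice': 6, 'bob': 7}
```

Any node can rerun `transfer` on the recorded inputs and check that it gets the recorded outputs, which is the whole verification story when only one program exists.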

Ethereum opens the scope of what these programs can be beyond simple transfers of sums to anything which can be programmed on a Turing machine. If you slept through your CS classes, this means that anything that can be programmed can be programmed on Ethereum.

Ethereum enables this complexity by placing a virtual machine (called the Ethereum Virtual Machine, or EVM) in every node on the network. The EVM is not conceptually different from any other virtual machine. You may already be familiar with the Java Virtual Machine (JVM), for example. Just as JVM code will run on any machine hosting a JVM and produce identical outputs over the same set of inputs, the EVM enables the Ethereum blockchain to reach consensus about the proper output of any EVM code based on a set of inputs.
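
The determinism that consensus relies on can be illustrated with a toy stack machine in Python. This is a sketch of the general idea only: the three opcodes and the `run` function are stand-ins invented for this example, not the real EVM instruction set or its semantics.

```python
def run(program, inputs):
    """Execute a list of (opcode, operand) pairs on a fresh stack."""
    stack = list(inputs)
    for opcode, operand in program:
        if opcode == "PUSH":
            stack.append(operand)
        elif opcode == "ADD":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif opcode == "MUL":
            right, left = stack.pop(), stack.pop()
            stack.append(left * right)
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return stack.pop()


# Computes (input + 2) * 3.
program = [("PUSH", 2), ("ADD", None), ("PUSH", 3), ("MUL", None)]

# Three independent "nodes" run the same code on the same input.
node_results = [run(program, [5]) for _ in range(3)]
print(node_results)  # [21, 21, 21]
```

Because `run` depends on nothing but the program and its inputs, every node arrives at the same answer, and a node reporting anything else is simply wrong.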

Full nodes on the Bitcoin blockchain store every transaction made going back to the genesis block (block zero); full nodes on the Ethereum blockchain additionally store the static code (if any) associated with a given account and that code’s current state in storage.

Imagine a trivial program stored at an account which accepts a number as input, adds that number to a running sum and overwrites the previous sum with the new one. Two accounts have sent transactions to this contract account, the first inputting five and the second inputting two. Stored on the Ethereum blockchain are:

The account and its static code.
The account’s current storage state with its sum set to seven.
A historical account storage state with the sum set to five.
A historical account storage state with the sum set to zero.
Three transaction records: one from when the code was initially stored, one from the account which inputted five and one from the account which inputted two.
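
The list above can be reproduced with a small Python model. This is a sketch under simplifying assumptions: `ToyChain` and the account names are invented for illustration, and it keeps one storage snapshot per transaction, which is far cruder than how real nodes organize state.

```python
from dataclasses import dataclass, field


@dataclass
class ToyChain:
    """Holds the transaction records and every storage state of one contract."""
    transactions: list = field(default_factory=list)
    states: list = field(default_factory=list)

    def deploy(self, sender):
        # Storing the code is itself a transaction; the sum starts at zero.
        self.transactions.append({"from": sender, "kind": "deploy"})
        self.states.append({"sum": 0})

    def call_add(self, sender, number):
        # Each call appends a new state instead of mutating the old one.
        new_sum = self.states[-1]["sum"] + number
        self.transactions.append({"from": sender, "kind": "add", "input": number})
        self.states.append({"sum": new_sum})


chain = ToyChain()
chain.deploy("creator")
chain.call_add("account_a", 5)
chain.call_add("account_b", 2)

print([state["sum"] for state in chain.states])  # [0, 5, 7]
print(len(chain.transactions))                   # 3
```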

Imagine a similar program stored at a separate account (necessarily) which does the same thing, but also stores a linear array of two-field structs (a struct is a template for a structured arrangement of data), each containing an address denoting the transaction sender and the input the sender provided. Two accounts have sent transactions to this contract account, the first inputting five and the second inputting two. Stored on the Ethereum blockchain are:

The account and its static code.
The account’s current storage state with its sum set to seven and an array containing two structs.
A historical account storage state with its sum set to five and an array containing one struct.
A historical account storage state with its sum set to zero and an empty array.
Three transaction records: one from when the code was initially stored, one from the account which inputted five and one from the account which inputted two.

Now we can reconstruct this account’s prior states trivially and see which accounts interacted with it to create those states. This pattern, however, should be avoided. Why? In the example given above, all the data stored in the array could have been reconstructed from the transaction records already on the blockchain, so storing it a second time buys nothing. At this point more devious readers have already imagined several ways to nuke the Ethereum blockchain as described thus far; next we’ll learn how Ethereum prevents DoS attacks on its nodes’ hard drives and CPUs, and what those measures mean for both developers and users. In part, they mean that developers need to be judicious in deciding when to write data.
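
To see why the array is redundant, here is a short Python sketch that rebuilds both the sender list and the historical sums from the transaction records alone. The record layout (`from`, `kind`, `input`) is an assumption made for this example, not a real Ethereum data format.

```python
transactions = [
    {"from": "creator", "kind": "deploy"},
    {"from": "account_a", "kind": "add", "input": 5},
    {"from": "account_b", "kind": "add", "input": 2},
]


def reconstruct_senders(records):
    """Rebuild the (sender, input) pairs the contract was storing in its array."""
    return [(tx["from"], tx["input"]) for tx in records if tx["kind"] == "add"]


def replay_sums(records):
    """Rebuild every historical value of the running sum."""
    sums = []
    total = 0
    for tx in records:
        if tx["kind"] == "add":
            total += tx["input"]
        sums.append(total)
    return sums


print(reconstruct_senders(transactions))  # [('account_a', 5), ('account_b', 2)]
print(replay_sums(transactions))          # [0, 5, 7]
```

Everything the array held falls out of a read-only pass over data the chain keeps anyway, so a contract that also writes it to storage stores the same facts twice.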

The Price of Gas

What is to stop anybody from uploading a contract with 10 terabytes of static code and exhausting the storage of the network’s full nodes? Or one which spins the CPU continuously to no effect? Ethereum transactions have fees just like Bitcoin transactions to incentivize miners to process the transactions and secure the network, but Ethereum’s fees take the form of a “gas cost.” Just like a car needs so many gallons to travel such a distance, Ethereum transactions require so much Ether to spin so many CPU cycles or store such a quantity of data. By simple virtue of Ether being a scarce and valuable resource, DoS attacks are prevented. A blockchain billionaire looking to burn their fortune on a prank could slow the network for a time, but the winning miner of the nefarious transaction’s block would see quite a payday!

What does this mean for developers and users? While reading from a local copy of the blockchain is free, writing to it and computing with it are not. Storage in particular is expensive, since any data written needs to be stored forever. Spinning the CPU is comparatively cheap. Even a write that merely overwrites an existing value, which in a non-blockchain context would reuse already allocated memory, is a store operation here, because historical states are always saved. Ethereum is Turing-complete, so nothing stops you from writing a video encoder and publishing it to the blockchain: you just probably won’t ever be able to afford using it. Assuming the code for such a program is at least several thousand lines, even storing it wouldn’t be trivially cheap. A good heuristic for what Ethereum contracts are practically capable of is: “could it be done on a smartphone from 1999?”
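
A toy gas meter in Python shows the accounting. Treat every number here as made up: `COST_COMPUTE_STEP`, `COST_STORE_WORD`, and `gas_price` are illustrative values chosen only so that storage dwarfs computation, not Ethereum’s actual fee schedule. The sketch also assumes the sender attaches a spending cap (a gas limit) to each transaction, which is what lets a runaway loop be cut off.

```python
class OutOfGas(Exception):
    """Raised when execution needs more gas than the sender paid for."""


# Hypothetical prices in gas units; storage is deliberately far costlier.
COST_COMPUTE_STEP = 3
COST_STORE_WORD = 20_000


class GasMeter:
    def __init__(self, gas_limit):
        self.gas_limit = gas_limit
        self.gas_used = 0

    def charge(self, amount):
        if self.gas_used + amount > self.gas_limit:
            raise OutOfGas(f"limit of {self.gas_limit} gas exceeded")
        self.gas_used += amount


def add_to_sum(meter, storage, number):
    meter.charge(COST_COMPUTE_STEP)  # pay for the addition
    new_sum = storage["sum"] + number
    meter.charge(COST_STORE_WORD)    # pay for the write before performing it
    storage["sum"] = new_sum


def spin_forever(meter):
    while True:
        meter.charge(COST_COMPUTE_STEP)


storage = {"sum": 5}
meter = GasMeter(gas_limit=50_000)
add_to_sum(meter, storage, 2)
gas_price = 2  # hypothetical price per unit of gas
fee = meter.gas_used * gas_price
print(meter.gas_used, fee)  # 20003 40006

spinner = GasMeter(gas_limit=30_000)
try:
    spin_forever(spinner)
except OutOfGas:
    print("halted after", spinner.gas_used, "gas")  # halted after 30000 gas
```

The single write accounts for nearly all of the first transaction’s cost, and the infinite loop stops the moment its budget runs dry.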

As a developer, this means you need to think hard about your code’s efficiency. Storage efficiency matters most, but every CPU cycle costs your users money too. If two contracts do the same thing, the more efficient of the two will win and take all.

So, knowing now both the theoretical possibilities and practical limitations of smart contracts: what makes them so cool?

A Real World Example

Before you can start raving with the rest of us about infrastructureless government and other transformative ideas that smart contracts enable, let’s walk through a simple real-world use case for a smart contract to whet your appetite and get you thinking.

Let’s say my band and I have just finished recording a great new album that we want to share with our fans. The problem is that we’re punks, and The Man is The Man whether his name is iTunes or YouTube. The idea is to print a limited edition of 100 copies of our album on vinyl and register every purchaser to attend an exclusive purchasers-only show at the best dive bar we know. On the Old Internet we might have used something like PayPal to accept payments. PayPal would take a cut of every transaction, we would mail a copy of the album and then hopefully remember to mark off another sale in a spreadsheet such that when the one hundred and first person asked to make a purchase we would say no. The whole proposition is so rickety that it’s no wonder artists and fans alike pay premiums to transact through intermediaries like Ticketmaster and Bandcamp! Luckily, our drummer has some experience writing Ethereum smart contracts, so we decide to code up a simple “registry” to make this all happen.

The registry contract is simple. It is composed of three functions: purchase, provePurchase and claimAlbum. A fan sends the specified amount of Ether to the contract’s purchase function through a web page. If the amount sent is greater than or equal to the price specified, a counter is incremented and the sending account’s Ethereum address is recorded in an array as a struct with two fields: the address and an integer claimed set to 0. This transaction will fail (and refund the fan’s Ether) if incrementing the counter would leave it in excess of 100.

After making the purchase, the fan sends us a (physical) address to which we should mail their vinyl. To do this, the claimAlbum function finds the struct associated with the sending account from purchase-time and increments its claimed field by one. If and only if claimed is equal to one does our web page accept the fan’s address, which we then mail the vinyl to. We’ve made address submission contingent on sending a transaction to the claimAlbum function using the same account that was used to transact with the purchase function, thus ensuring that we only accept addresses from people who have actually purchased the album and that we only send each of them one copy.

When it comes time for the big show, how can we make sure nobody sneaks in who didn’t pay for our album? This is where we use the provePurchase function. With an iPad at the door, people can sign transactions to the provePurchase function using the accounts they purchased the album with. If those accounts exist in the contract’s storage array, provePurchase will return true and we know they purchased the album. The bouncer waves them through.
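
The three functions can be modeled in a few lines of Python. This is a simulation of the logic described above, not contract code: a real version would be written in a contract language and would handle Ether and refunds on-chain. The class name `Registry`, the `PurchaseFailed` exception, and the choice to reject underpayments outright are assumptions made for this sketch.

```python
class PurchaseFailed(Exception):
    """Stands in for a failed transaction whose Ether is refunded."""


class Registry:
    def __init__(self, price, max_copies=100):
        self.price = price
        self.max_copies = max_copies
        self.counter = 0
        self.purchasers = []  # one {"address", "claimed"} entry per sale

    def _find(self, sender):
        for entry in self.purchasers:
            if entry["address"] == sender:
                return entry
        return None

    def purchase(self, sender, value):
        if value < self.price:
            raise PurchaseFailed("payment below price")
        if self.counter + 1 > self.max_copies:
            raise PurchaseFailed("sold out")
        self.counter += 1
        self.purchasers.append({"address": sender, "claimed": 0})

    def claim_album(self, sender):
        entry = self._find(sender)
        if entry is None:
            raise PurchaseFailed("no purchase on record")
        entry["claimed"] += 1
        return entry["claimed"]  # the web page accepts an address only on 1

    def prove_purchase(self, sender):
        return self._find(sender) is not None


registry = Registry(price=10, max_copies=2)  # small cap to show the limit
registry.purchase("fan_a", 10)
registry.purchase("fan_b", 12)
try:
    registry.purchase("fan_c", 10)
    sold_out = False
except PurchaseFailed:
    sold_out = True

first_claim = registry.claim_album("fan_a")
second_claim = registry.claim_album("fan_a")
print(sold_out, first_claim, second_claim)  # True 1 2
print(registry.prove_purchase("fan_b"), registry.prove_purchase("fan_c"))  # True False
```

The second claim returns 2, so the web page would refuse the extra mailing address, and `fan_c` gets turned away at the door.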

What’s more, all of the promises we make about the album’s exclusivity can be verified by the fans before purchasing. The compiled code of an Ethereum contract is stored at the contract’s address, where anyone can read it. By making the contract’s source code available open-source, we let anybody independently verify what the contract does: compile the source and check that the result matches the code stored at the contract’s address.
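
The check a skeptical fan performs boils down to comparing fingerprints of two blobs of code. Here is a minimal Python sketch of that comparison. The byte strings are placeholders rather than real compiled contracts, and `hashlib.sha3_256` is a stand-in: Ethereum uses Keccak-256, which is not the same function as standardized SHA3-256.

```python
import hashlib


def fingerprint(code: bytes) -> str:
    """Hash a blob of code into a short, comparable fingerprint."""
    return hashlib.sha3_256(code).hexdigest()


def matches_deployed(local_code: bytes, deployed_code: bytes) -> bool:
    """True when the code we built ourselves is what is stored on-chain."""
    return fingerprint(local_code) == fingerprint(deployed_code)


# Placeholder bytes standing in for compiled contract code.
published = bytes.fromhex("6005600201")
deployed = bytes.fromhex("6005600201")
tampered = bytes.fromhex("6005600301")

print(matches_deployed(published, deployed))  # True
print(matches_deployed(published, tampered))  # False
```

A single changed byte produces a different fingerprint, so a band that quietly raised the cap past 100 could not pass this check.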

What’s been described here is a naive implementation with some details omitted, but it should be enough to get you thinking about just how disintermediating smart contracts really are. Ujo Music, for example, has implemented these ideas in a much more robust and foolproof way that allows micro-payments for streams, re-sellable downloads and more at the artist’s discretion. What’s more, the artist keeps 100% of what they earn. People pay Apple 30% of their revenue for the security that blockchains provide nearly for free.

Welcome to Blockchain World

Now we can write code of arbitrary complexity, store it on the blockchain, find it at its address, and expect it to execute on every node in the network when its functions are called. Consensus on the result of executing code is achieved based on the validity of the code’s state changes in storage. These complex transactions are mined just like transactions in Bitcoin. The programs we upload to the blockchain are called contracts, and they are effectively the backend for decentralized applications, or dapps. The idea of a “smart contract” turns out to be a pretty good metaphor for how code running on Ethereum works, both in terms of its advantages (immutable public records of unbreakable agreements producing a predictable result for any possible input) and limitations (it’s cheaper than lawyers but not completely free).

Get out there and build the future you want to live in!
