Rugged started as part of a Drupal Association project to enhance the security of the Drupal.org software repository. Its primary goal was to protect automated software deployments from supply chain attacks, by implementing The Update Framework (TUF) Specification.
One of the secondary goals of the project was to open-source the resulting code and documentation, to make it easier for other free/libre open source software (FLOSS) projects to adopt the TUF framework. Eventually, we split out the TUF server component, Rugged.
Over the course of the project, we began to recognize that TUF is quite complicated. Even people with lots of experience with security, cryptography and package management often find it difficult to understand. Below, we try to explain some of the basic problems the TUF Spec is trying to solve, and how it suggests that we can address them.
We’re going to cover the following topics:
First off, a new type of software security threat has been on the rise: supply chain attacks. In fact, by some estimates, they tripled in the last year alone. Some have been fairly high-profile, such as the recent Log4j and SolarWinds attacks.
But what is a “supply chain attack”?
Here the “supply chain” refers to all the software components that are used to build a modern application. Instead of attacking a specific organization’s IT infrastructure, this kind of attack tries to inject malicious code into one of those components. If it’s successful, then any organization that uses the compromised component is potentially vulnerable.
One approach to address this type of threat systematically is The Update Framework (TUF). In the TUF Specification, the authors outline a host of attacks and weaknesses that they’re trying to address. They then go on to outline a rigorous process to secure software updates against these threats.
To help illustrate the nature of the problem space, let’s first consider some real-world examples: chequing accounts and contracts.
When you go to a bank, to open a chequing account, they’ll generally ask you to sign a signature card that they’ll keep on file. When you then write a cheque, the bank will compare your signature on the cheque to the one they have on file for you. They should only transfer money out of your account if they match.
The bank is verifying the authenticity of your signature.
Another common place we find signatures is on contracts. Contracts are most often between two parties, and so have two signatures. On occasion, one of the parties might try to get out of their contractual obligations by claiming that they never signed the contract. If we were to point to their signature on our copy of the contract, they may claim it’s a forgery. The technical term for this is “repudiation”.
How do we overcome this problem? One common approach is to have witnesses also sign the contract. Witnesses aren’t bound by the terms of the contract itself. Instead, their signature is meant to prove that the other parties did, in fact, sign the contract. If a contract dispute were to go to court, the witnesses may be called to testify to that effect.
Each witness is asserting the authenticity of the signatures.
It is also common to revise a contract’s terms. This often happens repeatedly before the contract is finally signed. But it can also happen after a contract has already been signed. So now we have multiple versions of a contract.
What happens if the parties can’t agree on which version is authoritative?
To avoid this problem, for really important contracts, a professional notary may be hired. The notary would review all copies of the most recent version of the contract and apply their seal and signature to each of them. The notary also keeps a logbook, where they register each time they notarize a document. These provide proof that all of the copies of the contract are identical.
A notary’s seal is embossed into the paper of each copy of the contract. Only the notary has access to their seal. Older or altered versions of a contract would not have this seal. So they wouldn’t be considered valid.
So the notary’s seal is validating the authenticity and integrity of the document (contract).
These are simplified descriptions of these processes. But they should illustrate how signatures and seals can help to authenticate and verify a document.
Let’s see how these principles can be applied to improving software security. Software security is, itself, a complicated subject. So we’ll start by describing a simplified example.
Imagine that I have a simple blog website. Despite its outward simplicity, this website runs atop many, many components. We refer to this as a “software stack”. It’s a simplified version of the “software supply chain” that we looked at earlier. Such a blog website will likely be:
There are many undiscovered bugs and security flaws in the thousands of components that make up this stack. To keep my website secure, I need to promptly update any component that releases a new version that fixes a security flaw. Otherwise, if I leave the insecure code unpatched, an attacker might exploit it.
For the sake of argument, let’s assume that my website was built using Drupal. Luckily, Drupal has a spectacular security team. They publish regular security advisories on a mailing list to which I’m subscribed. So I get alerted whenever I need to update my codebase.
When I receive an alert that a module I’m using has a new security release, I’ll use a tool like Composer to download the latest version of the module and deploy the update to my website. After a couple more steps, my website has been updated, and the bug fixed.
So far, so good… Except, attackers are getting increasingly sophisticated in how they go about compromising website security. What if, when I downloaded the new version of the module, someone managed to intercept the request and instead delivered a hacked version of the module? This is a “Person-in-the-Middle” (PITM) attack. Now I’m actually worse off than I was before, and I wouldn’t even know it.
This is an example of what we defined earlier as a “software supply chain attack”. There are a number of similar types of attacks. For example, a “replay” attack is one in which, instead of downloading a hacked version of a module, I instead download an older version. This can be just as detrimental, and might be even harder to detect. But for the sake of simplicity, let’s just use PITM for our example here.
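At its core, defending against a replay attack comes down to a version comparison: a client remembers the highest metadata version it has accepted so far, and refuses anything older. Here is a minimal sketch of that idea (the function name is invented for illustration; real TUF clients also check expiry timestamps and more):

```python
# Sketch of rollback ("replay") protection: never accept metadata whose
# version number has gone backwards compared to what we already trusted.

def accept_metadata(new_version: int, trusted_version: int) -> bool:
    """Return True only if the offered metadata is at least as new
    as the version this client has already accepted."""
    return new_version >= trusted_version

# A newer release is accepted; a replayed older release is rejected.
assert accept_metadata(12, 11)
assert not accept_metadata(9, 11)
```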
What we need is a way to validate the authenticity, integrity and freshness of the packages that we’re downloading.
This is where TUF comes in.
Two components are required to verify package integrity:
Luckily, there’s a Composer plugin available to help with the second part. In our example, during the PITM attack, Composer can (via the TUF plugin) tell that the module has been tampered with. It can then display an error message, so that we’re aware of the problem, and react accordingly.
The Composer TUF plugin basically steps in every time Composer downloads a file, and verifies the file’s integrity, freshness, etc. It does this by downloading information about the file, referred to as “TUF metadata”.
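As a rough illustration, the metadata for a single file might look something like the following. The file name, sizes, and placeholder values here are invented; the structure is patterned on the TUF specification’s “targets” metadata:

```json
{
  "signed": {
    "_type": "targets",
    "version": 42,
    "expires": "2025-01-01T00:00:00Z",
    "targets": {
      "some_module-2.1.3.zip": {
        "length": 123456,
        "hashes": {
          "sha256": "..."
        }
      }
    }
  },
  "signatures": [
    {"keyid": "...", "sig": "..."}
  ]
}
```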
This metadata includes information about the file, such as how big it should be. Most importantly, it contains a “hash” of the file. A hash is a long string of letters and numbers that uniquely identifies a specific file. If you change even a single character in the file, then you’ll get a completely different hash.
This hash acts like the notary’s seal. It allows us to easily verify that a given file has not been tampered with.
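This property is easy to see with Python’s standard hashlib module (the strings below are made up for illustration):

```python
import hashlib

# Two "files" that differ by only a single character.
original = b"Update framework release 1.0"
tampered = b"update framework release 1.0"

h1 = hashlib.sha256(original).hexdigest()
h2 = hashlib.sha256(tampered).hexdigest()

# Even a one-character change produces a completely different hash.
assert h1 != h2

# SHA-256 always yields 64 hexadecimal characters, regardless of input size.
assert len(h1) == 64
```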
But attackers are smart. How do we know they haven’t also altered the TUF metadata, to cover their tracks? This is where public-key cryptography and digital signatures come in.
Here things start to get really tricky. Public-key cryptography relies on some very advanced math, and the way it works is quite unintuitive. As such, we’re only going to try to explain the most important element: asymmetry.
Let’s start by looking at another real-world example: mailboxes. If you want to send me a letter, all you need is my address. However, for anyone to read that letter (once delivered), they’d need the key to my mailbox. Unlike my address, which is publicly available, I’m the only one with a key to my mailbox.
This situation is asymmetric, meaning that it cannot be done in reverse. My mailbox key doesn’t help me to send a letter to you.
Public-key cryptography works in a similar manner. It all starts by generating a “key pair”. A key pair consists of two files, each containing a long string of characters (a cryptographic key). However, not just any two keys form a key pair. They need to be generated together, and have a very special relationship.
If we encrypt a message with one of the keys, then it can only be decrypted by the other key in the pair. Notably, the key that was used to encrypt the message won’t help us to decrypt it. This holds true for both keys. Regardless of which one we use to encrypt a message, only the other one can be used to decrypt it.
When we generate a key pair, we designate one of the keys as “public”, meaning it can be shared freely with others. We designate the other key as “private”, meaning that it should never be shared at all.
The public key acts like my address, in the example above. It allows anyone to encrypt a message that only I can decrypt, because I’m the only one with access to the private key.
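We can demonstrate this with a toy version of the RSA scheme, using the tiny textbook primes 61 and 53 (far too small for real security, but enough to show the asymmetry):

```python
# Toy RSA key pair -- for illustration only, never for real security.
p, q = 61, 53
n = p * q        # 3233: the public modulus, part of both keys
e = 17           # public exponent: (e, n) is the public key
d = 2753         # private exponent: the inverse of e mod (p-1)*(q-1)

message = 65     # a message, encoded as a number smaller than n

# Anyone can encrypt with the public key...
ciphertext = pow(message, e, n)

# ...but only the private key can decrypt the result.
decrypted = pow(ciphertext, d, n)
assert decrypted == message
```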
Digital signatures are basically the application of this same principle, but in reverse.
Recall that either key can be used to encrypt a message that only the other key can decrypt. So, if I encrypt a message using my private key, then only the public key will be able to decrypt it.
Remember that I’m the only person with access to my private key. So if I send you an encrypted message, and you can decrypt it with my public key, then you can be pretty sure that I’m the one that sent the message.
Of course, this depends on me being responsible, and keeping my private key secure.
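Continuing with a toy RSA key pair (tiny textbook primes, illustration only), a signature is just the same modular-exponentiation trick run in the opposite direction:

```python
# Toy RSA key pair (primes 61 and 53) -- illustration only.
n, e, d = 3233, 17, 2753   # (e, n) is the public key; d is the private key

document = 123                    # pretend this number is a document digest
signature = pow(document, d, n)   # "encrypt" with the PRIVATE key to sign
recovered = pow(signature, e, n)  # anyone can "decrypt" with the public key

# If the recovered value matches the document, only the holder of the
# private key could have produced the signature.
assert recovered == document
```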
We should now be familiar with all the concepts we’ll need to make sense of TUF.
Earlier, we were concerned about whether an attacker might alter TUF metadata. Digital signatures are the solution.
The TUF server treats the information (metadata) about the files we’re downloading as a message that it encrypts using its private key. One detail I left out earlier: the Composer TUF plugin has a copy of the server’s public key. As a result, the plugin can decrypt the message, and thus be confident that it came from the TUF server.
This means that the Composer TUF plugin can confirm the authenticity and integrity of the TUF metadata. Since it can trust the metadata, it can proceed to verify the files themselves.
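Putting the pieces together, here is a hedged sketch of that client-side check, reusing a toy RSA key pair and a shrunken digest so everything fits in a few lines. Real TUF clients use proper signature schemes (such as Ed25519) over full metadata documents, not this miniature model:

```python
import hashlib

# Toy RSA key pair (primes 61 and 53) -- illustration only.
n, e, d = 3233, 17, 2753   # (e, n) is public; d stays on the server

package = b"module source code"
expected_hash = hashlib.sha256(package).hexdigest()

# The server signs a digest of the metadata with its private key.
# (We shrink the digest to fit the toy modulus; real schemes don't do this.)
digest = int(expected_hash[:3], 16) % n
signature = pow(digest, d, n)

# Step 1: the client verifies the signature with the server's public key,
# confirming the metadata's authenticity and integrity.
assert pow(signature, e, n) == digest

# Step 2: now that the metadata is trusted, the client checks the
# downloaded file against the hash it contains.
downloaded = b"module source code"
assert hashlib.sha256(downloaded).hexdigest() == expected_hash
```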
So a TUF server’s job is to:
Of course, this is all more complicated than what we’ve presented here. If you’re interested, I highly encourage you to read further on the TUF website.
The rest of this documentation site goes into more detail about how Rugged fulfills the TUF server requirements. However, it takes a more bare-knuckle approach, assuming familiarity with these concepts, or, at least, a willingness to do more research on your own.
That said, we’re always aiming to improve the clarity of this documentation. So don’t hesitate to suggest corrections or revisions.