The distinction between FHE and TEEs: The Downfall attack

Written by Florent Michel, Joseph Wilson

As a company focused on accelerating Homomorphic Encryption, we are often asked questions about the distinction between the security provided by FHE and other technical solutions for protecting data-in-use.

These questions often focus on existing hardware-level protections that are intended to isolate and defend data-in-use from the rest of a computing system. Some degree of hardware-based protection is present in almost every computing system; however, in the context of protecting very sensitive data, this hardware model can be extended to provide isolation of this data from even the system administrators themselves.

This isolation forms the basis of the Trusted Execution Environment, or TEE. Most of the largest CPU manufacturers provide some mechanism for creating a TEE, such as Intel Software Guard Extensions (SGX), ARM TrustZone, or AMD Secure Encrypted Virtualization (SEV).

While specifics differ, the common thread connecting these approaches lies in the implementation of a secure enclave, a region of encrypted memory. Notionally, even if data or instructions are read from these regions, they would be unintelligible; this information can only be decrypted and used on the fly when within the CPU.

This provides additional protection against many potential threats, but it’s not infallible, and a recently released hardware attack methodology has once again highlighted the challenges that face TEEs.

The short answer to the question regarding the distinction between FHE and TEEs is that only the former offers total security for data when in use within a processing system.

If you’re interested in the long answer, read on. We’ll start by describing how information can be extracted from computing systems without directly attacking the core security measures, and then take a look at some recent examples of when memory-level protections have failed. We then go on to describe the foundational distinction between the protections offered by FHE, and the root of the failure that enables attacks on conventional microprocessing architectures.

Side channel attacks

It’s extremely rare for a properly-implemented and well-designed cryptographic solution to be defeated head-on, by attacking the actual principles on which it is based. For example, AES encryption is considered to be nigh-unbreakable in the context where an attacker doesn’t have the secret key.

However, if an attacker can recover the secret key by, say, monitoring the time taken for your CPU to execute the key generation step, then they’ve found a way of attacking the system that lies outside of the aspects that can be covered by cryptography. We would say that the attacker has successfully used side-channel information, and this kind of attack is known as a side-channel attack.

These attacks can be ingenious, or surprising; we recently saw an attack against keycard-controlled doors that can successfully recover a secret by analysing video of a power LED blinking. The pattern of the blinks reflects the instantaneous power draw of the system, which can give the attackers some clue as to the operations that the system is performing.

The tools attackers have for finding these clues aren’t limited to just LEDs either; past examples of hardware side-channel attacks (to name but a few) include analysing the high-frequency noise generated by systems working on cryptographic algorithms, or even deciphering the subtle differences in sound produced by keystrokes on a phone screen or keyboard. It’s also been noticed that AI is increasingly used to assist attackers in accurately analysing the complex data gathered in the course of such an attack.

Of course, while all of these attacks are possible in principle, some require specific or unrealistic conditions to replicate. In practical terms, the range of side-channel attacks that can be deployed in the field is smaller than the range of all known side-channel attacks.

However, where we do find practical side-channel attacks, defending against them is notoriously tricky. Now the attack surface, the range of potential vulnerabilities that we have to consider, is no longer neatly covered by cryptography, or the protocols that can be built on top of it. Instead, we have to account for the messy business of computing as a physical phenomenon.

For example, in the case of the AES timing attack, we can prevent this attack by ensuring that our key-generation algorithm runs in constant time, by which we mean deliberately programming the algorithm such that it takes the same amount of time to run regardless of the specifics of the key.

That’s a relatively straightforward solution that can be executed in software, and most good-quality cryptographic libraries (e.g. libsodium/NaCl) already implement safe behaviours in the vulnerable algorithms.
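To make the idea of constant-time code concrete, here is a minimal Python sketch. It is an illustration rather than the AES case itself: it uses a simpler operation, comparing two byte strings, and the function names `leaky_equal` and `constant_time_equal` are our own. The first version stops at the first mismatching byte, so its running time hints at how much of a guess was correct; the second inspects every byte regardless. Pure Python offers no real timing guarantees, so in practice one would call the standard library’s `hmac.compare_digest`, shown at the end.

```python
import hmac


def leaky_equal(a: bytes, b: bytes) -> bool:
    """Early-exit comparison: running time depends on where the first mismatch is."""
    if len(a) != len(b):
        return False
    for x, y in zip(a, b):
        if x != y:
            return False  # returns sooner the earlier the mismatch occurs
    return True


def constant_time_equal(a: bytes, b: bytes) -> bool:
    """Inspects every byte, wherever (or whether) the inputs differ."""
    if len(a) != len(b):
        return False
    diff = 0
    for x, y in zip(a, b):
        diff |= x ^ y  # accumulate differences without branching on them
    return diff == 0


secret = b"correct horse battery staple"
guess = b"correct horse battery stapl3"

print(leaky_equal(secret, guess), constant_time_equal(secret, guess))
# The standard-library helper intended for this job:
print(hmac.compare_digest(secret, guess))
```

Both functions return the same answers; the difference lies only in how much their running time reveals about the secret.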

Addressing hardware problems is more involved; in the case of the power LED attack outlined above, the answer is to apply a capacitor or similar filter to the LED power trace in the circuit. The objective here is to smooth out the power supplied to the LED, and thus limit or prevent the tell-tale flickering.
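The effect of such a filter can be sketched numerically. The Python snippet below is a rough model under our own assumptions, not a circuit design: it treats the capacitor as a first-order low-pass filter, and the trace values, the smoothing factor `alpha` and the function names are all hypothetical. It shows the swing in the LED supply shrinking once the filter is applied, which is what makes the flicker harder to read.

```python
def low_pass(samples, alpha):
    """First-order low-pass filter: y[n] = y[n-1] + alpha * (x[n] - y[n-1])."""
    out = []
    y = samples[0]
    for x in samples:
        y += alpha * (x - y)  # move a fraction of the way towards the input
        out.append(y)
    return out


def ripple(samples):
    """Peak-to-peak swing of a trace."""
    return max(samples) - min(samples)


# Hypothetical LED supply trace: 1.0 while the device is busy, 0.8 otherwise,
# switching every five samples.
raw = [1.0 if (n // 5) % 2 == 0 else 0.8 for n in range(200)]
smoothed = low_pass(raw, alpha=0.05)

print(f"raw ripple:      {ripple(raw):.3f}")
# Skip the first 100 samples so the start-up transient is not counted.
print(f"filtered ripple: {ripple(smoothed[100:]):.3f}")
```

Note that the filtered ripple is reduced rather than eliminated, which matches the aim of limiting the tell-tale signal as far as is practical.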

Fixing the problem in hardware like this might mean that a product line or two is affected, which is expensive; there’s the question of remediation, recalls, refunds and so on. But what happens when a practical side-channel attack hits a system that is in use everywhere, with no practical alternative?

Spectre and Meltdown…

That’s a nightmare scenario that has repeatedly hit the chip-making industry. A few years ago, the Spectre and Meltdown attacks generated international headlines when they revealed hardware vulnerabilities in nearly every Intel processor manufactured since 1995, as well as a number of ARM-based processors. AMD-based processors were also found to be vulnerable to the same essential principle behind these attacks.

The technical specifics of these attacks are complex, but the core idea is that Spectre and Meltdown allowed attackers to overcome the isolation boundary model of security that segregates interactions between processes in a CPU.

As applied in an Intel CPU, these boundary layers are known as “protection rings”, which describe the conceptual level of control over hardware resources assigned to applications and processes. These rings range from Ring 3 (the lowest level of privilege, occupied by applications) through to Ring 0, which is reserved for the core OS processing kernel, with the intermediate levels reserved for things such as hardware drivers and virtual machines.

Nominally, this permission-based isolation means that data stored in memory under Ring 0 is protected from malicious applications, which means that it can be used to store sensitive information such as passwords and encryption keys when in use. However, while applications don’t have direct control over memory, they do have the capacity to request resources from processes running in Ring 0. It’s this request mechanism that allowed Spectre and Meltdown to work.

Both attacks exploit a property of CPUs called speculative execution, a technique used to accelerate processing by allowing the CPU to “guess” possible future commands that might be sent to it, execute the outcome of each command, and then retain and act on only the outcome corresponding to the actual instruction that it receives.

Executing these commands before they are actually received speeds up the overall flow of program execution, and is an extremely helpful trick for bridging the gap in latency induced by retrieving information held in slower RAM.

Meltdown and Spectre manipulated this behaviour to leak information from Ring 0 memory. While slightly different with respect to their mechanism of action, the essential idea behind both is that by submitting requests over data held in protected memory, even though the request would eventually be refused, the CPU would still speculatively execute the outcome. This would temporarily place protected information in the CPU cache, which could then be dumped to unprotected memory.

By repeatedly performing this attack, the entire contents of the kernel memory (including passwords and encryption keys) could be extracted to an insecure environment and read by a malicious application. In short, almost every processor in use at the time harboured a critical flaw that would allow attackers access to areas of a computer’s memory that would normally be off-limits, and read the most important secrets.
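The shape of this leak can be illustrated with a toy model in Python. This is a deliberately simplified sketch and not a working exploit: the `ToyCPU` class, its methods and the stored secret are all hypothetical, and the `is_cached` check stands in for what a real attacker does, which is to time memory accesses and infer which cache line was touched. The point it makes is that the illegal read is refused, yet a trace of the secret is left behind in the cache.

```python
class ToyCPU:
    """A deliberately simplified model, not a real microarchitecture."""

    def __init__(self, kernel_memory: bytes):
        self._kernel = kernel_memory
        self.cached_lines = set()  # cache state: not undone when a read is refused

    def flush(self) -> None:
        self.cached_lines.clear()

    def read_kernel(self, addr: int) -> int:
        # Speculative phase: the CPU runs ahead before the permission check resolves.
        secret = self._kernel[addr]
        self.cached_lines.add(secret)  # a dependent load caches line number `secret`
        # The check resolves: the read is illegal, so its result is discarded,
        # but the cache is left as it was.
        raise PermissionError("user code may not read kernel memory")

    def is_cached(self, line: int) -> bool:
        # Stands in for timing a memory access: cached lines load faster.
        return line in self.cached_lines


def leak_byte(cpu: ToyCPU, addr: int):
    cpu.flush()
    try:
        cpu.read_kernel(addr)  # always refused...
    except PermissionError:
        pass
    for line in range(256):  # ...but one cache line now encodes the secret byte
        if cpu.is_cached(line):
            return line
    return None


cpu = ToyCPU(b"hunter2")
recovered = bytes(leak_byte(cpu, i) for i in range(7))
print(recovered)  # b'hunter2'
```

Repeating `leak_byte` address by address is the toy equivalent of walking through protected memory one byte at a time.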

Fixing this problem was fraught; simply disabling speculative execution would have slowed many workloads to a crawl. Fortunately, updates to the microcode of the processors could be made remotely, preventing the need for what would likely have been the biggest and most expensive product recall in history.

… and now Downfall

Now the curse of speculative execution side-channels has struck again, this time in the form of the Downfall attack.

However, what’s particularly interesting is that the isolation boundaries affected by the attack don’t just include the standard Security Ring model that was violated by the Spectre and Meltdown attacks, but also extend to Intel Software Guard Extensions (SGX).

In brief, the authors of the Downfall attack determined that the “Gather” instruction can leak the content of a vector register on the CPU. As this leaks information that is inside the CPU (and thus can’t be retained in an encrypted form), this attack, like Spectre and Meltdown, bypasses the in-memory encryption that forms a key part of the security of a secure enclave.

A more detailed explanation of Gather instruction manipulation is provided by Intel themselves, who are releasing a microcode update.

Why is FHE different?

The unifying factor in all of these side-channel attacks lies in the fact that in regular computing hardware, data must be decrypted at some point in order to be used. And, as we have repeatedly seen, it is incredibly difficult to ensure that there are no hardware side channels that could be used to leak this decrypted information. The Wikipedia article on SGX contains a list of known attacks on the SGX architecture and their remediations, which goes some way towards highlighting the scale of the challenge.

However, the existence of this list doesn’t preclude the possibility that there are as-yet unknown attacks that could be launched against the principle of a TEE, nor does it rule out the worst-case possibility that malicious actors have discovered a new vulnerability that they are keeping to themselves.

FHE doesn’t face this problem; in fact, in the security model that we’re most concerned about (i.e. executing computations in untrusted computing environments), FHE is an effective guard against all hardware-based side-channel attacks.

Not only does the information remain encrypted at all points, including when inside hardware processing environments (thus neutralising the attack vectors pursued by speculative execution attacks, which rely on dumping information which is decrypted on-the-fly within the CPU), but no cryptographic material that could be used to decrypt this information is held anywhere in the untrusted system.

As information is never decrypted, there’s no need to send a decryption key for use in on-the-fly decryption; the only additional cryptographic material needed is a “server” key, which facilitates bootstrapping, as well as the keys or hints required for operations such as ciphertext rotation and relinearisation. None of these pieces of information are sensitive, as they cannot be used to decrypt or otherwise break the security of the encrypted information. In short, there’s no point in the process where an attacker could ever gain access to data protected under FHE.
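The principle of computing on data that is never decrypted can be shown with a small, self-contained Python sketch. Some loud caveats apply: it uses the Paillier cryptosystem, which is only additively homomorphic and is not FHE, so there is no bootstrapping or server key involved; the primes are far too small for real use; and all function names and figures are our own illustrative choices. What it does show is an untrusted party adding up values it cannot read, with the decryption key never leaving the client.

```python
import math
import secrets


def keygen(p: int, q: int):
    """Return (public n, private (lam, mu)), using the common choice g = n + 1."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # modular inverse (Python 3.8+)
    return n, (lam, mu)


def encrypt(n: int, m: int) -> int:
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1  # fresh randomness for every ciphertext
        if math.gcd(r, n) == 1:
            break
    return ((1 + m * n) * pow(r, n, n2)) % n2


def decrypt(n: int, private_key, c: int) -> int:
    lam, mu = private_key
    x = pow(c, lam, n * n)
    return ((x - 1) // n) * mu % n


def server_sum(n: int, ciphertexts):
    """Runs on the untrusted machine: it sees only n and ciphertexts, never a key."""
    n2 = n * n
    total = 1  # a trivial encryption of zero
    for c in ciphertexts:
        total = (total * c) % n2  # multiplying ciphertexts adds the plaintexts
    return total


# Client side. Two known Mersenne primes; far too small for real security.
n, private_key = keygen(2**31 - 1, 2**61 - 1)
salaries = [52_000, 61_000, 48_000]
ciphertexts = [encrypt(n, s) for s in salaries]

# Untrusted side: works on ciphertexts only.
encrypted_total = server_sum(n, ciphertexts)

# Back on the client, the only place the private key exists.
print(decrypt(n, private_key, encrypted_total))  # 161000
```

Nothing that `server_sum` touches would help an attacker who dumped the server’s registers, cache or memory: every value it handles is a ciphertext or a public parameter.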

With hardware side-channel attacks removed from the picture, attackers are instead obliged to directly attack the security of an encryption method backed by decades of research. This approach ensures that data remains protected not only for years, but for decades and more.
