This is very similar to a design Apple announced for iCloud Keychain several years ago at Black Hat.
iCloud Keychain synchronizes keychains across iOS devices, storing their contents encrypted under the user's passphrase on Apple's cloud servers. In theory, Apple has no access to this data, since they don't know the relevant passphrase. In practice, however, passphrases are weak, even under PBKDF2. An attacker that got access to Apple's cloud environment would simply dictionary attack the encrypted blobs, and would probably succeed a lot of the time.
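To make that concrete, here's a minimal Python sketch of why offline access to passphrase-derived material is fatal without a rate limiter. Everything here is illustrative (the salt, iteration count, and wordlist are made up, and the "blob" is modeled as just the derived key), not Apple's actual scheme.

```python
import hashlib

# Toy model of an offline dictionary attack on PBKDF2-protected data.
# "stolen_verifier" stands in for whatever passphrase-derived material
# an attacker exfiltrated from the cloud; all parameters are illustrative.

SALT = b"per-user-salt"
ITERATIONS = 10_000  # real deployments use more, but that only slows the attack

def derive_key(passphrase):
    return hashlib.pbkdf2_hmac("sha256", passphrase.encode(), SALT, ITERATIONS)

# What the attacker holds after compromising the cloud environment:
stolen_verifier = derive_key("iloveyou2")

def dictionary_attack(verifier, wordlist):
    # No attempt counter, no rate limit: try candidates as fast as
    # the hardware allows until one derives the same key.
    for candidate in wordlist:
        if derive_key(candidate) == verifier:
            return candidate
    return None

cracked = dictionary_attack(stolen_verifier, ["123456", "password", "iloveyou2"])
```

Raising the iteration count raises the attacker's cost linearly, but for human-chosen passphrases the candidate space is small enough that this is a delay, not a defense.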
So instead of the obvious naive design, Apple stores enough secret data in an HSM so that you can't attempt a decryption without the involvement of the HSM. At the same time, the HSM enforces an attempt counter, preventing brute force attacks. To scale the design, Apple partitions customers into "clubs" of HSMs, with the attempt counter synchronized among the HSMs of the club using a distributed commit algorithm.
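The guard logic the HSM enforces can be sketched in a few lines. This is a toy single-node model, not Apple's actual interface, and the limit of 10 is illustrative:

```python
class RateLimitedKeyEscrow:
    """Toy model of an HSM-guarded secret: recovery attempts must go
    through this object, which enforces a hard attempt limit.
    Illustrative only, not Apple's actual HSM interface."""

    MAX_ATTEMPTS = 10  # illustrative

    def __init__(self, passcode, secret):
        self._passcode = passcode
        self._secret = secret
        self._attempts = 0

    def try_recover(self, guess):
        if self._secret is None:
            raise RuntimeError("secret destroyed after too many failed attempts")
        if guess == self._passcode:
            self._attempts = 0  # a successful recovery resets the counter
            return self._secret
        self._attempts += 1
        if self._attempts >= self.MAX_ATTEMPTS:
            # Cap on brute force: wipe the secret for good.
            self._passcode = None
            self._secret = None
        return None
```

In the clustered design, `_attempts` is exactly the state the HSMs of a "club" must agree on, which is why a distributed commit algorithm enters the picture at all.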
(Somewhat infamously, Ivan Krstic detailed how they protected the HSMs themselves from malicious attacks by putting their software update signing keys through a "physical hash function" called "Vitamix blender".)
What Signal is doing here is essentially what Apple did, but using SGX instead of an HSM, and Raft as the consensus algorithm to synchronize the counters. You might reasonably prefer the Apple approach to SGX, but at the same time, the data that Signal is storing is a lot less sensitive than the data Apple stores.
Probably the biggest end-user takeaway from this announcement is that it's the start of a process where Signal is able to durably and securely store social graph information for its users (without revealing the social graphs directly to Signal itself, unlike virtually every other secure messaging system). Once they can do that, they'll have ended most of their dependence on phone numbers.
FWIW, there are a couple of things about Apple's and Google's systems that don't work for Signal:
1. There is no meaningful remote attestation. There's no way to verify that there are HSMs at the other side of the connection at all. The people who issued the certificates are the same people terminating the connections.
2. There's no real information about what these HSMs are or what they're running. Even if we trust that the admin cards have been put in a blender, we don't know what the other weak spots are.
3. The services themselves are not cross-platform, so cross-platform apps like Signal can't use them directly.
4. It's not clear how they do node/cluster replacement, and it seems possible that they require clients to retransmit secrets in that case, which is a potentially significant weakness if true. I could be wrong about this, but the fact that I have to speculate is kind of a problem in itself.
My impression is that you're suggesting the HSMs Apple uses are better than SGX in some way, but it's not clear that anyone could know one way or the other. I think all of the scrutiny SGX is receiving is ultimately a good thing: it helps shake out bugs and improve security. It's not clear to me that the HSMs Apple uses would actually fare better if scrutinized in the same way, which could be a missed security opportunity for them.
We didn't feel that it would be best for Signal to start with a system where we say "believe that we've set up some HSMs, believe this is the certificate for them, believe the data that is transmitted is stored in them." So we've started with something that we feel has somewhat more meaningful remote attestation, and hopefully now we can weave in other types of hardware security, or maybe even figure out some cross-platform way to weave in existing deployments like iCloud Keychain etc.
"My impression is that you're suggesting the HSMs Apple uses are better than SGX in some way, but it's not clear that anyone could know one way or the other. "
I predicted SGX would have more attacks simply because it's widely available and there are more incentives to attack it. They started showing up. The HSMs get an obfuscation benefit on top of whatever actual security they have.
The main benefit of a good HSM, though, is its tamper-resistance. It takes a meeting of mutually suspicious parties to confirm it was received, set up properly, and loaded with the right code, and that no secret updates can happen outside those meetings. From there, there's probably a greater chance that no secrets were extracted from it than from an Intel box with who knows what SGX attacks, side channels, etc. going around.
My recommendation was combining several of them (i.e. security via diversity) if one could afford it. The systems in front of them should also have strong endpoint security, carefully sanitizing and monitoring the traffic. Think a security-focused design such as OpenBSD or INTEGRITY-178B instead of Linux. A safe systems language for any new code. Good that you're using some Rust.
Honestly, I'm just hedging against people who spend a lot of time thinking about SGX and have formed opinions about it. I don't have a strong opinion either way. My "take" here is just that the information you're protecting with SGX is information Wire "protects" with indexed plaintext in a database, and that SGX vs. HSM is not really a useful debate to have in this one case.
These days, however, you can do password resets manually with Apple. It's no longer as stringent as before, when losing your password without recovery methods enabled meant your account was as good as gone. The current Apple account system is a lot weaker.
I'm very puzzled by the consensus group load balancing section. The article emphasizes that correctness of the Raft algorithm was super important (to the point that they skipped clear optimizations!!11), but then immediately follows up with (as far as I can tell) a load-balancer wrapper approach for rebalancing and scaling. My "this feels like consensus bug city" detectors immediately went off.
Consensus algorithms (including Raft and Paxos) are notoriously picky and hard to get right around cluster membership changes. If you try to end-run around this by sharding to different clusters with a simple traffic director choosing the cluster, how does the traffic director achieve consensus with the clusters that the traffic is going to the right cluster? You haven't solved any consensus problem, you've just moved it to your load balancers.
A solution for this problem (to agree on which cluster the data is owned by) is 2-phase commit on top of the consensus clusters. It didn't appear from the diagrams that that's what they did here, so either I missed something, or this wouldn't pass a Jepsen test.
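For illustration, the shape of that 2PC layer might look like this. All names are hypothetical, and each participant below stands in for an entire Raft group; in a real system, each group's prepare/commit decision would itself be an entry committed through its own log:

```python
# Toy two-phase commit for the "which cluster owns this key" handoff
# described above. Each ClusterStub stands in for a whole consensus
# cluster; all names here are illustrative, not Signal's design.

class ClusterStub:
    def __init__(self, name):
        self.name = name
        self.owned = set()       # keys this cluster currently owns
        self.prepared = {}       # txid -> (key, become_owner)

    def prepare(self, txid, key, become_owner):
        # Phase 1: durably log the proposed ownership change, vote yes.
        self.prepared[txid] = (key, become_owner)
        return True

    def commit(self, txid):
        # Phase 2: apply the logged change.
        key, become_owner = self.prepared.pop(txid)
        if become_owner:
            self.owned.add(key)
        else:
            self.owned.discard(key)

    def abort(self, txid):
        self.prepared.pop(txid, None)

def transfer_ownership(txid, key, src, dst):
    participants = [(src, False), (dst, True)]
    # Phase 1: every participant must vote yes before anyone commits.
    if all(c.prepare(txid, key, own) for c, own in participants):
        for c, _ in participants:
            c.commit(txid)
        return True
    for c, _ in participants:
        c.abort(txid)
    return False
```

The point of the atomic handoff is that no moment exists where both clusters (or neither) believe they own the key, which is exactly the invariant a bare load balancer can't give you.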
Did I miss something?
[If you did build 2PC on top of these consensus clusters, you'd have built a significant portion of Spanner's architecture inside of a secure enclave. That's hilarious.]
> These were hardscrabble people, living off of whatever meager storage they could scrounge together. They’d zip things, put them on zip drives, and hope for the best. Then one day almost everyone looked up towards the metaphorical sky and made a lot of compromises.
I wish every tech blog was written like this. Light-hearted and serious at the same time, almost like a work of fiction.
How exactly does SGX remote attestation work? From the linked document (https://software.intel.com/en-us/articles/innovative-technol...) it seems like it hashes execution state, but what's stopping the enclave from emulating the execution while also on the side performing some malicious operation?
Basically you're trusting the processor to truthfully attest its execution state. You can't emulate it because the attestation is signed with a key that's burned into the processor.
It is stored in e-fuses, which can be read, but it is encrypted using a Physical Unclonable Function, which means to read it you would need the CPU to be running. Very difficult but I'd be surprised if it were impossible.
Not necessarily a master key - it could use a PKI scheme, similar to how TPMs work.
>The solution first adopted by the TCG (TPM specification v1.1) required a trusted third-party, namely a privacy certificate authority (privacy CA). Each TPM has an embedded RSA key pair called an Endorsement Key (EK) which the privacy CA is assumed to know. In order to attest the TPM generates a second RSA key pair called an Attestation Identity Key (AIK). It sends the public AIK, signed by EK, to the privacy CA who checks its validity and issues a certificate for the AIK. (For this to work, either a) the privacy CA must know the TPM's public EK a priori, or b) the TPM's manufacturer must have provided an endorsement certificate.) The host/TPM is now able to authenticate itself with respect to the certificate. This approach permits two possibilities to detecting rogue TPMs: firstly the privacy CA should maintain a list of TPMs identified by their EK known to be rogue and reject requests from them, secondly if a privacy CA receives too many requests from a particular TPM it may reject them and blacklist the TPMs EK. The number of permitted requests should be subject to a risk management exercise. This solution is problematic since the privacy CA must take part in every transaction and thus must provide high availability whilst remaining secure. Furthermore, privacy requirements may be violated if the privacy CA and verifier collude. Although the latter issue can probably be resolved using blind signatures, the first remains.
Yeah it seems like you're right. I was assuming it was the same so that clients could verify SGX enclaves using a stored copy of Intel's public key.
However as far as I can tell they actually have a unique key per CPU, and they store a database of them which you have to query over the internet to verify an enclave.
It has the downside of requiring a network request to Intel to verify the enclave, but it does mean that there isn't a master key to leak.
I don't get it: what prevents an attacker from deleting everyone's secrets by just guessing? Shouldn't the guesses be time-limited instead (e.g. once per hour)? Even then you could easily bring the service down...
If someone attempted that large an attack, it would be pretty visible, and perhaps there's a mechanism to reset those tries after an amount of time. Targeted attacks on the other hand...
This is inherently good in itself. But I ask myself if the oft-requested "can we be people without a phone # in Signal" is now actually a deliverable, or if they only stated it as a hypothesis, and still don't have that as a roadmap outcome.
I want to have two (or more) non-phone-enabled devices able to be in Signal. My tablet, and my computer. I realize there are adjunct methods, but depending on a physically present device to have one thing hooked up isn't actually what we want here. The phone is not a useful second factor; it's a hack I believe they worked out to get beyond the "must be a phone" state without having to re-engineer the back end.
So: do we now get phone-less signal identity? This feels like a precursor. Does that definitionally say phone-less identity will follow?
(again only from belief: I believe the secure enclave on the phone is bound into identity along with the IDD, so having a cloud-backed secure enclave breaks one of the two dependencies out a bit)
I don't think it would make sense for them to now state that they're going to deliver that no matter what, because then
1. You're going to have people asking for timelines on that even more often than people are currently asking for non-phone number authentication, and if you indulge them, they're going to get angry with you for missing it.
2. Who knows what other problems might turn up when implementing it.
I'd expect them to announce it very close to it being actually possible. Until then, we'll probably only hear about individual problems they solved on the road to there.
It's essentially a way of storing your password hash in the cloud such that nobody - not even the cloud providers - can read the hash and try to brute force it. All anyone can do is send password attempts to the cloud server's SGX system, which in theory is completely private, even to the host OS.
It also provides a way for the client to verify the code that is running in the SGX system, so you know you're sending your password attempts to some program that really does do all this fancy stuff. You don't have to take Signal's word for it.
It's basically equivalent to the chip in iPhones that stores your PIN and counts failed attempts. Except it's in the cloud and distributed, which is way harder to do.
Not really. The goal is to safely store an extra random value that's mixed with the password hash to derive the master key for the account, because they don't want to fully trust the password hash, because some passwords are too weak.
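A minimal sketch of that mixing step, with hypothetical names (HMAC stands in here for whatever KDF is actually used; the only point is that the master key requires both the password and the server-held random value):

```python
import hashlib
import hmac
import os

# Toy sketch of key splitting: the master key is derived from BOTH a
# password-derived key and a random value held only by the enclave-
# protected server, so a weak password alone can't be brute-forced into
# the master key. All names and parameters are illustrative.

def password_key(password, salt):
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)

salt = os.urandom(16)
server_secret = os.urandom(32)  # lives only inside the enclave / HSM

def master_key(password):
    # Without server_secret, an attacker who can guess the password (or
    # who steals the password hash) still can't reconstruct this key.
    return hmac.new(server_secret, password_key(password, salt),
                    hashlib.sha256).digest()
```

The enclave's job is then just to gate access to `server_secret` behind an attempt counter, rather than to do anything clever with the password itself.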
Could've just made password requirements stronger, but that doesn't provide an excuse to play with SGX, I guess :)
It's an attempt at storage of secrets on a remote server using SGX and consensus protocols. Signal seems to be using this to store people's social graphs "in the cloud".
4. Send key-pair (token, split2) to remote machines running BOLTed [3] server code in an SGX enclave [4] that replicate it using the Raft consensus protocol [5] over Noise [6] in hardware-encrypted RAM [7] and never on-disk.
6. Generate as many application-keys as required, and use these to encrypt user's data, for instance, app-key-sge = hmac-sha256(master_key, "social graph encryption") and so on...
-
Recovery of the secret (by the client):
1. On the client, using user-provided pass-phrase, generate sk, token, and split1, like above.
2. Send the token to a remote machine. If the token is valid, remote lets the client retrieve split2 [8] with which the client can now generate the master-key and app-keys, as before.
3. If the token sent is invalid, remote lets the client retry a very limited number of times before destroying the secret key-pair.
---
[0] From the post, ...we’ve been working on new techniques based on secure enclaves and key splitting that are designed to enhance and expand general capabilities for private cloud storage. Our aim is to unlock new possibilities and new functionality within Signal which require cross-platform long-term durable state, while verifiably keeping this state inaccessible to everyone but the user who created it.
[1] https://en.wikipedia.org/wiki/Argon2 (end-users prefer shorter passwords, but shorter passwords mean less entropy... Argon2's memory-hard hashing raises the cost of brute-forcing them, which is what it brings to the table)
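The app-key derivation in step 6 above can be condensed into a short sketch. The master key below is a placeholder for the value reconstructed from split1 and split2, and the second label is hypothetical:

```python
import hashlib
import hmac

# Sketch of step 6: derive per-purpose application keys from the
# recovered master key with HMAC-SHA256, so each feature gets its own
# key without storing anything extra server-side.

master_key = bytes(32)  # placeholder; really reconstructed from split1 + split2

def app_key(label):
    return hmac.new(master_key, label.encode(), hashlib.sha256).digest()

app_key_sge = app_key("social graph encryption")
app_key_backup = app_key("message backup encryption")  # hypothetical label
```

Because the labels are fixed strings, the client can regenerate every app-key deterministically after recovery; only the master key material ever needs protecting.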
I still don't know how we are going to be able to validate that the remote attestation comes from the enclave and not, say, a virtualized one that just logs all the secrets but still attests correctly. Are they going to ship Intel device hardware pubkeys or certs to the clients?
>I still don’t know how we are going to be able to validate the remote attestation comes from the enclave and not, say, a virtualized one that just logs all the secrets but still attests correctly
by checking that the attestation is signed with the Intel key and not some self-signed key?
All of my signal desktop clients seem to update automatically, downloading new code and prompting me to restart the app to run it.
With such opaque, basically-RCE privileges assumed by the app, I am not sure what would stop a malicious insider from sending an update to silently replace the sgx keys or certs (or shipping any other targeted secret exfiltration code).
Seems to me that with the client autoupdate, this would be a much easier nut to crack than the secure enclave (not to disparage the work on the defense-in-depth). It just seems like a ton of work and complexity to me for not a ton of benefit considering the existing threat models.
This is a super interesting problem! It applies to basically all desktop autoupdaters. Note that App Stores are in a better place -- as far as I know they don't make it possible to target users with malicious updates, you'd have to push the malicious update to everyone at once.
Some anonymity wins for desktop apps are downloading updates over Tor (to prevent the update server knowing who you are by your IP) or Bittorrent (to prevent the ability to offer an update to someone without offering it to everyone), and checking that you're being offered a binary whose hash is published publicly (e.g. to a blockchain). But then you start hitting increased code surface problems and those systems not working on some networks.
I think Tor's Firefox browser does the anonymous Tor update thing; that's probably the best in class here.
Apple's desktop OS updaters have historically had the ability, serverside, to target specific serial numbers or MAC addresses to receive certain firmware updates. I would be surprised if they don't have similar functionality on the iOS side of things.
I mean, we're talking about engineers from two different companies at this point, right? Presumably the Foo App engineers have to prepare and sign a malware version of Foo and then you have to go coerce a bunch of Apple engineers to send the iOS update to specific phones.
Once your conspiracy theory involves multiple uncoordinated groups, the chance of it remaining secret basically goes to zero. Once you stop holding people at gunpoint they're going to tell someone what you did. Once it's known that Apple targeted malware at some of its users, its reputation will crash. So Apple would probably resist this as strongly as it can, which is probably pretty strongly given that it's the most valuable public company in the entire world.
Am I missing something? This is not sounding like a plausible conspiracy theory so far.
It's not a conspiracy theory if the proposed scenario is the DOJ says to Apple in a lawful wiretap order "you must push this wiretapped messaging app update to phone identified by IMEI 12345". Wiretaps and stored comms records subpoenas are commonplace things, and the DOJ is entirely capable of contracting someone to quickly produce a specially-modified wiretap version of any app available on the app store (it's not a terribly difficult technical hurdle), even the ones that aren't open source.
Whether or not Apple would cooperate by providing the appropriate app signature and shipping it to the phone or not is another question. There is of course the legal argument that creating a cryptographic signature against their will is compelled speech and thus they have the right to fight or refuse. Personally, I doubt they'd fight very hard, provided the whole thing were kept secret (which would be in the interest of both Apple and the DOJ). Apple knows quite well that you can't fight city hall (or the CCP). The wiretapped iCloud servers in China are a perfect example of Apple's reputation remaining intact despite shipping a nationwide backdoor for a government to millions of their iPhone and iPad customers.
After all, they'd always have the standard "we received a lawful wiretap order, signed by a federal judge, appropriately limited in scope to a single suspect, with which we were legally required to comply" fallback.
Jeez, I guess so. We're pretty far off track: when I wrote that "[app stores] don't make it possible to target users with malicious updates" up there, what I was talking about was that the app stores aren't giving Foo App's developers the power to push individualized malware to their users, which is a power that Foo App has through its own desktop autoupdater system.
App stores raise the bar from "Foo App decides to own you" to "The DoJ is able to convince a company with the legal firepower of Apple to sign and deliver a backdoored version of someone else's app".
I suppose so, but I am not keeping my conversations secret from the nice folks at OWS, I am keeping them secret from the federal government who likes to lock people up on bogus trumped-up charges for criticizing them or their allies in public; see weev or Roger Ver or Assange for examples.
The DoJ threat model is much more likely and much more dangerous than a rogue insider at the app dev shop pushing an update.
Especially since virtually every other secure messaging system simply stores the information Signal is discussing here in plaintext, indexed in SQL databases running in their cloud environments.
Is this going to make app backup/restore even more torturous?
I already almost got bit by the transition to the current magic number + special in-app export, vs. the previously-working Titanium Backup APK + data snapshot method.
It's not clear to me how the nodes authenticate a new node on node replacement. They say that the nodes check the new node's MRENCLAVE value, so what happens if there is a software update and the MRENCLAVE value has to change? How do the old nodes know what the new MRENCLAVE value should be?
Why does Signal want to make things so complicated?
This starting premise about the 'normal approach' is not true:
> However, you may want to change devices, and accidents sometimes happen. The normal approach to these situations would be to store data remotely in an unencrypted database,
No, the normal approach would be to let authenticated users export their data from their own secure device to a place of their choosing, then let authenticated users also import that data.
For paternalistic control-freaks like the Signal team, the data could even only ever be exported in an encrypted format, for encryption-at-rest. Sure, there'd be some risk that the encryption key is not well-protected, or the data-at-rest is subject to brute-force attacks – but many users can manage those risks themselves.
So why this strawman premise that the baseline is "remote" and "unencrypted"? Just give me a local export, and import, and I'll protect my data fairly well, thank you.
That would solve a major usability disaster of Signal on iOS devices: that even an orderly, planned device upgrade where both devices are in your sole control – and could conceivably do a direct transfer of all sensitive data! – will still lose all your history and Signal-contacts.
The cloud-centric strawmen from Signal continue with this related claim:
> In the example of a non-phone-number-based addressing system, cloud storage is necessary for recovering the social graph that would otherwise be lost with a device switch or app reinstall.
No, again, all a user needs is a local backup/transfer method for that list of usernames/identity-endpoints. (And to be no worse than the current Signal approach of re-using the device's native contact list, this list-at-rest or list-in-transit only needs protection as good as the native contact list, a pretty low bar.)
They don't really explain any specific problems with what they've slurred as a "zip drive" approach.
Offering a self-backup option would be more competitive than Signal's current nothing at all – which means total data loss on any device loss or upgrade, for iOS users.
Absolutely, promise users a Signal-quality cloud experience, someday, when this novel development is finished, and when there's an economic model for reliably storing users' opaque data in the cloud – and if users trust Intel SGX.
But in the indefinitely-long meantime, why deny users the proven & well-worn path of self-data management? (And shouldn't users have the right to an exportable format of their own contact/messaging data?)
The manual backup & restore flow on Android is quite fragile - I filed two separate bugs during the process. It kinda seems like most people never get it working, which isn't surprising when you're dealing with two phone OSs interacting with a desktop OS (plus a user who is probably unfamiliar with directly accessing their phone storage over USB). The troubleshooting costs are quite high for such a rarely-used feature when their main competition (WhatsApp) has seamless backup restore. Investing time to develop the manual backup & restore feature on iOS might not be worth it.
That's useful info on the Android experience... but if it works for anyone, it's still better than Signal's crappy answer to iOS users: "wait until someday when you can trust our new crypto & Intel SGX to upload all your data to a cloud".
And yes, WhatsApp has a seamless experience – even if you never share your contact lists, or back-up your message history, to WhatsApp. It "just works" even if you only use encrypted local iOS phone backup/restores – no "unencrypted database in the remote cloud" required. That's why I find Signal's rationale, that "unencrypted remote cloud" is either the "normal approach" or "necessary for recovering the social graph", to be a hand-wavy & borderline dishonest oversimplification. There's a simple approach that works pretty well, including in their major competitor, without the flaws they allege!
Debugging an "export to a single file whose format is completely under our control", then "import from that single file" cycle isn't exactly rocket science.
But Signal clearly sees their desire to do rocket science as a sufficient excuse for why iOS users (& perhaps per your report, many Android users) can't have any working backup/device-migration solution for 5 years, and counting.