During the past month, I've been testing a new backup strategy and system for the FSFE, built around Duplicity and remote storage accessible via SSH. While we've gone through some of the attack vectors to see if this makes sense, I would appreciate more eyes on it. Our previous backup system was significantly simpler in a way: a single computer with massive amounts of disk space, talking directly via SSH to the client machines and running rsnapshot.
The reason we set out to configure a new backup system at all is that our previous one will not be available for much longer. Given our need to store terabytes of backup data, we could not add the needed backup space to one of our current hosts at a reasonable cost. So instead I decided to investigate the option of separating the storage from the backup controller, with the idea that if the storage is encrypted, we would be more flexible in where and how it is stored, as access would not (only) depend on physical security.
There are three agents in this new system:
- Our backup server (BACKUP)
- Our client machine which is supposed to be backed up (CLIENT)
- A remote storage accessible via SSH and having ~10TB storage (REMOTE)
The flow of our system at the moment is basically as follows:
- BACKUP controls and initiates the backups by pulling data from CLIENT,
- CLIENT allows BACKUP access to its file system in read-only mode via sshfs (this is enforced by a dedicated ssh key whose authorized_keys entry starts with command="/usr/lib/openssh/sftp-server -R"),
- BACKUP runs duplicity and encrypts backups with a specific (unpublished) GnuPG key, where the secret key passphrase is not stored on BACKUP itself but only kept by our system administrators,
- Duplicity on BACKUP sends the backup archives to REMOTE.
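To make the flow above more concrete, here is a minimal sketch in Python of the commands involved. It only assembles the command lines and prints them; it does not run anything. The host names, paths, key ID and the root login are placeholders made up for illustration, not our actual configuration, and the duplicity invocation should be checked against the installed version, since the command line syntax differs between releases.

```python
import shlex

# All values below are hypothetical placeholders, not the real FSFE setup.
CLIENT_HOST = "client.example.org"
MOUNT_POINT = "/mnt/client"
SSH_KEY = "/root/.ssh/backup_readonly"
GPG_KEY_ID = "0xDEADBEEF"
REMOTE_URL = "sftp://backup@remote.example.org//srv/backups/client"


def authorized_keys_entry(public_key):
    """Line for CLIENT's authorized_keys: the key can only start a
    read-only SFTP server (the -R flag), whatever BACKUP asks for."""
    return f'command="/usr/lib/openssh/sftp-server -R" {public_key}'


def sshfs_mount_command(host, mount_point, key_file):
    """Command BACKUP would run to mount CLIENT's file system read-only.
    Logging in as root is an assumption made for this sketch."""
    return ["sshfs", f"root@{host}:/", mount_point,
            "-o", f"ro,IdentityFile={key_file}"]


def duplicity_command(source_dir, target_url, gpg_key_id):
    """Command BACKUP would run to encrypt to the public key and upload.
    Only the public key is needed to encrypt; the secret key stays off BACKUP."""
    return ["duplicity", "--encrypt-key", gpg_key_id, source_dir, target_url]


if __name__ == "__main__":
    print(authorized_keys_entry("ssh-ed25519 AAAA... backup@BACKUP"))
    print(shlex.join(sshfs_mount_command(CLIENT_HOST, MOUNT_POINT, SSH_KEY)))
    print(shlex.join(duplicity_command(MOUNT_POINT, REMOTE_URL, GPG_KEY_ID)))
```

Building the commands as argument lists rather than shell strings keeps quoting problems out of the picture if they are later passed to something like subprocess.run.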
Here are the attack vectors I've currently considered in this:
- If CLIENT is compromised, the attacker does not get any access to the backups as there's no authentication in that direction.
- If BACKUP is compromised, the attacker can remove backups (since BACKUP has access to REMOTE), but is not able to modify CLIENT (as BACKUP's access to it is read-only).
- If BACKUP is compromised, an attacker cannot read backups as they are encrypted.
- If REMOTE is compromised, the attacker gets encrypted containers for which they don't have the key.
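The list above can also be written down as a small lookup table, which makes it easier to see what is missing. This is only a sketch of my own reasoning in Python; the capability names are invented for illustration, and it records only what the list states, nothing more.

```python
# What an attacker gains beyond the compromised machine itself,
# according to the attack vectors listed above. Names are illustrative.
READ_CLIENT_FILES = "read files on CLIENT"
MODIFY_CLIENT = "modify CLIENT"
DELETE_BACKUPS = "delete backups on REMOTE"
READ_BACKUP_CONTENTS = "read backup contents"
HOLD_ENCRYPTED_ARCHIVES = "hold encrypted archives"

EXPOSURE = {
    # No authentication from CLIENT towards BACKUP or REMOTE.
    "CLIENT": frozenset(),
    # BACKUP can reach REMOTE and has read-only access to CLIENT.
    "BACKUP": frozenset({READ_CLIENT_FILES, DELETE_BACKUPS}),
    # REMOTE only ever sees encrypted containers.
    "REMOTE": frozenset({HOLD_ENCRYPTED_ARCHIVES}),
}


def attacker_can(compromised_agent, capability):
    """True if compromising the given agent yields the given capability."""
    return capability in EXPOSURE[compromised_agent]


if __name__ == "__main__":
    for agent, capabilities in EXPOSURE.items():
        print(agent, "->", sorted(capabilities) or "nothing further")
```

In this table, no single compromise yields the backup contents in readable form, which is the property the encryption is meant to provide.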
Of course, there are issues with this. Given the controlling function of BACKUP, an attacker could for instance try to sneak under the radar and replace or destroy files in transit between BACKUP and REMOTE, ultimately rendering our backup archives invalid. And if BACKUP is compromised, even with just read-only permission, it may be possible to read enough data from our servers to find authentication keys or similar, which would provide a way to escalate privileges.
Those drawbacks are likely to be the same regardless of which strategy we employ, though. One way around them would be for the client machine to encrypt data before sending it to the backup server, but this places a stronger emphasis on having local software installed on the client machine, which we try to avoid in favor of a simple ssh connection.