How many backups do you need?

There is a very easy and simple 3-2-1 backup rule:

  • 3: Each file should exist 3 times (original + 2 backups)
  • 2: Use 2 different types of storage systems (e.g. internal disk + tape / cloud / ...)
  • 1: Have one copy offsite (to survive e.g. natural disasters, or issues with your cloud / infrastructure provider)

Examples

Where your data is | Where first backup is    | Where second backup is                                 | Comment
Internal disk      | Internal disk            | Amazon S3 (region with enough distance to your server) |
Amazon S3          | Amazon S3 (other region) | Azure Storage (other region)                           | At least one of the backups needs to be in a different region

Cloud accounts for backups

AWS

  • In the AWS console search for S3, create a bucket, disable public access, remember the bucket name
  • It might make sense to enable "Bucket Versioning" in the bucket properties. Deleted files are then only hidden, and if you modify a file, the old content is still available. If you do not enable this, automated syncs might overwrite good backups with bad local files
  • Also add a "Lifecycle rule" under bucket management to ensure that deleted files are at some point really deleted, half-uploaded files are cleared, old versions of files are at some point removed, etc.
  • Search for IAM, create a group (e.g. backupusers), give the group the permission "AmazonS3FullAccess", create a user (it is good to have multiple users for the different things you want to back up), add the user to the group, select the new user, go to security credentials, and create an access key. Remember the key and the secret (a scripted version of these steps is sketched below)
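
All of the console steps above can also be scripted with the AWS CLI. A minimal sketch, assuming the bucket name com.example.foo, the region eu-central-1 and the user name backup-foo are placeholders for your own values:

# Bucket: create, block public access, enable versioning
aws s3api create-bucket --bucket com.example.foo --region eu-central-1 \
  --create-bucket-configuration LocationConstraint=eu-central-1
aws s3api put-public-access-block --bucket com.example.foo \
  --public-access-block-configuration BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=true,RestrictPublicBuckets=true
aws s3api put-bucket-versioning --bucket com.example.foo \
  --versioning-configuration Status=Enabled
# (Lifecycle rules can be scripted too, via "aws s3api put-bucket-lifecycle-configuration")

# IAM: group with S3 access, one user per thing you back up
aws iam create-group --group-name backupusers
aws iam attach-group-policy --group-name backupusers \
  --policy-arn arn:aws:iam::aws:policy/AmazonS3FullAccess
aws iam create-user --user-name backup-foo
aws iam add-user-to-group --group-name backupusers --user-name backup-foo
aws iam create-access-key --user-name backup-foo   # prints the key and the secret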

Tools

Rclone

Rclone is an amazing tool to copy files between many different local and cloud storage systems.

apt-get install rclone
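
Depending on your distribution the packaged version can be quite old. The Rclone project also documents an install script that fetches the current release (see https://rclone.org/install/):

curl https://rclone.org/install.sh | sudo bash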

Rclone configuration

Rclone comes with a built-in configuration assistant. Just run this and answer the questions about the storage you want to add:

rclone config
  n) New remote
     Amazon S3 Compliant Storage Providers including AWS, ...
       Amazon Web Services (AWS) S3
         Enter AWS credentials in the next step.
           ...
This tells you where the generated configuration is stored:
rclone config file
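
The configuration is a plain INI-style text file. For an S3 remote named bar it looks roughly like this (the values depend on your answers; key and secret are shortened here):

[bar]
type = s3
provider = AWS
access_key_id = AKIA...
secret_access_key = ...
region = eu-central-1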

Rclone commands

If you added an S3 storage with the bucket name com.example.foo and named the remote bar, then you can do this

rclone ls                                                                  bar:/com.example.foo/
rclone copy    --retries 1 --copy-links                      /tmp/pictures bar:/com.example.foo/pictures
rclone sync    --retries 1 --copy-links                      /tmp/pictures bar:/com.example.foo/pictures
rclone sync -P --retries 1 --copy-links --max-backlog 100000 /tmp/pictures bar:/com.example.foo/pictures

Note that sync will also delete files in the destination folder that are not in the source folder, while copy never deletes anything. When you want to follow the progress (-P), a high value for --max-backlog is good to see how long the sync will actually take.
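
Because sync deletes files in the destination, it can be worth previewing a run first. Rclone supports this with --dry-run, which only logs what would be copied or deleted without changing anything:

rclone sync --dry-run /tmp/pictures bar:/com.example.foo/pictures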

Rclone encryption

After you have created your first Rclone storage, you can add a second one that encrypts the file content, file names and folder names before sending it to the first one. As long as you have the Rclone configuration with the password, this is totally transparent for you; just read and write from the second storage. The advantage is that the data cannot be read by the cloud provider, the disadvantage is that you will always need Rclone to read the data

rclone config
  n) New remote
     Encrypt/Decrypt a remote (crypt)
      remote> bar:/com.example.foo/
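
The result is just another remote in the configuration file, used exactly like the plain one. Assuming you named the encrypted remote bar-crypt and kept the default answers, the generated entry looks roughly like this:

[bar-crypt]
type = crypt
remote = bar:/com.example.foo/
filename_encryption = standard
directory_name_encryption = true
password = ...   # stored obscured, not in plain text

Reading and writing then works as before, only against the new remote:

rclone copy /tmp/pictures bar-crypt:/pictures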