I have a problem. I run backups weekly, but the process is pretty manual, time-consuming, and complex. Let me show you how I got here.
I follow the 3-2-1 backup rule. At least 3 copies of your data, on at least 2 kinds of media, with at least one being offsite. Easy. I bought a 1TB SSD and set up Time Machine to back up to it and run it once a week. Then, I have Backblaze run in the background for a cloud backup. So far so good.
The problem: I have data on iCloud that I can’t keep on my laptop for space reasons (my laptop is a mere 256GB, but I have 150ish GB on iCloud Drive, plus photos). Simple solution: get another 1TB SSD as a backup for iCloud, and then upload that stuff to Backblaze B2. And that’s what I did. I temporarily deleted a whole bunch of stuff from my laptop to make space, got iCloud Drive to sync, copied everything to the SSD, and Backblaze B2 did the rest. But there were two problems: (a) the sync apparently didn’t complete, so Backblaze and the SSD now have .icloud files everywhere instead of the real files, and (b) at some point B2 had weird issues where it would say B2 had newer files than the SSD–I used the skipNewer option to get around this, but sometimes it just sits there printing nothing and I have no clue if something is happening.
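If you want to check whether a copy of your iCloud Drive got hit by problem (a), something like the following might help. It is only a minimal sketch, and it rests on an assumption: that the stubs are hidden files named after the real file, like `.report.pdf.icloud`. The function names are made up for this example.

```python
import os


def find_icloud_placeholders(root):
    """Return paths under root that look like iCloud placeholder stubs.

    Assumption: a stub is a hidden file ending in ".icloud",
    e.g. ".report.pdf.icloud" standing in for "report.pdf".
    """
    stubs = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.startswith(".") and name.endswith(".icloud"):
                stubs.append(os.path.join(dirpath, name))
    return sorted(stubs)


def original_name(stub_path):
    """Guess the real file name a stub stands in for."""
    name = os.path.basename(stub_path)
    # Drop the leading dot and the trailing ".icloud".
    return name[1:-len(".icloud")]
```

Running `find_icloud_placeholders` on the SSD folder before syncing to B2 would at least tell you the copy is incomplete before you upload it.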
So today I repaired my backup. Delete all of the iCloud stuff from B2. Request data from Apple using privacy.apple.com. Spin up a VM on Google Cloud with more than enough disk space, download the files there (did I mention this means clicking download on my Mac, then grabbing the download link in Firefox, and canceling it, so that I can use curl to download on the VM?), and extract them. These zip files contain more zip files inside, so you now extract those too. And now your data is in half a dozen folders, so you use cp to merge them together. Finally, use B2 to upload to Backblaze’s servers. This entire goddamn process of repairing my backup took about 6 hours. As for future backups, any time I move anything to iCloud Drive, I also turn on the VM, copy stuff there, run b2 sync, and then turn it off (although copying the files from the VM to the second SSD and then running b2 sync from there would also work). Did I mention I buy my music, so I want a backup of that in B2 as well?
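I did the merge step with cp, but if you are nervous about one folder silently overwriting files from another, a small Python helper is one alternative. This is a sketch, not what I actually ran: `merge_tree` is a hypothetical name, and it assumes that on a name clash you want to keep whatever is already in the destination and just get a list of what was skipped.

```python
import os
import shutil


def merge_tree(src, dest):
    """Copy everything under src into dest without overwriting.

    Returns the sorted relative paths of files that were skipped
    because dest already had a file with the same name.
    """
    skipped = []
    for dirpath, _dirnames, filenames in os.walk(src):
        rel_dir = os.path.relpath(dirpath, src)
        target_dir = os.path.normpath(os.path.join(dest, rel_dir))
        os.makedirs(target_dir, exist_ok=True)
        for name in filenames:
            target = os.path.join(target_dir, name)
            if os.path.exists(target):
                # Keep the existing file and report the clash.
                skipped.append(os.path.normpath(os.path.join(rel_dir, name)))
            else:
                shutil.copy2(os.path.join(dirpath, name), target)
    return sorted(skipped)
```

Calling it once per extracted folder, all with the same `dest`, gives you one merged tree plus a list of clashes to look at by hand.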
And now for the photos. This one is easy: on the second SSD, I also have a folder to download iCloud Photos to. But iCloud doesn’t let you easily export photos–so I use icloudpd to do the job.
So come Saturday, I do the following:
- Plug in the first SSD so Time Machine can do its thing.
- Plug in the second SSD.
- First, copy over any updates to iCloud Drive to this SSD. Run B2.
- Next, run the icloudpd tool. Sync that to B2.
- Finally, if I’ve purchased any music in the past week, copy that file over and sync to B2.
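The command-line parts of that routine could be written down as a small script. This is a sketch rather than something I run: the bucket name, the folder paths, the remote folder names, and the Apple ID are placeholders, and it assumes the `b2 sync <source> b2://<bucket>/<path>` form and icloudpd's `--directory`, `--username`, and `--recent` options, which may differ between versions. It does not cover Time Machine or copying iCloud Drive changes onto the SSD, which stay manual.

```python
import shlex
import subprocess


def saturday_plan(drive_dir, photos_dir, music_dir, bucket, apple_id, recent=150):
    """Build the list of commands for the weekly run (nothing is executed here)."""
    return [
        ["b2", "sync", drive_dir, f"b2://{bucket}/icloud-drive"],
        ["icloudpd", "--directory", photos_dir,
         "--username", apple_id, "--recent", str(recent)],
        ["b2", "sync", photos_dir, f"b2://{bucket}/photos"],
        ["b2", "sync", music_dir, f"b2://{bucket}/music"],
    ]


def run_plan(plan, dry_run=True):
    """Print each command; only execute them when dry_run is False."""
    lines = []
    for cmd in plan:
        line = shlex.join(cmd)
        print(line)
        lines.append(line)
        if not dry_run:
            # Stop at the first failing step instead of carrying on.
            subprocess.run(cmd, check=True)
    return lines
```

Keeping `dry_run=True` as the default means a typo in a path shows up in the printed commands before anything touches B2.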
Yes, this can be vastly simplified…if I get a laptop with more storage and maybe a NAS, and get rid of iCloud altogether. But that involves a significant amount of money (although, that is the end goal).
Dear Reader, since you made it this far, here’s my gift: a set of commands that make this a little bit easier for me.
- Getting all the music you bought from iTunes: Go to ~/Music/iTunes/iTunes Media/Music and run
for file in **/**/*.{mp3,m4a}; do echo $file; done | sort | uniq
- Downloading iCloud Photos using icloudpd:
icloudpd -d . --recent 150 # or as many pictures as you think you took
- Extracting files from what Apple will give you:
import os
for file in list.str`ls *.zip`:
    `unzip {file} && rm {file}`
for dir in list.str`ls -d */`:
    os.chdir(dir)
    for file in list.str`ls *.zip`:
        `unzip {file} && rm {file}`
    os.chdir('../')
This code is written in pysh, a superset of Python for which I wrote a transpiler. Once this finishes, it’s just a matter of merging the directories together.
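If you don’t have pysh around, roughly the same extraction can be done in plain Python with the standard zipfile module. This is a sketch and not a line-for-line port: `extract_nested` is a name I made up, and unlike the snippet above it keeps going until no zip files are left at any depth, instead of stopping after one level of nesting.

```python
import zipfile
from pathlib import Path


def extract_nested(folder):
    """Extract every .zip under folder next to itself, then delete it.

    Repeats until no archives remain, so zips inside zips are handled.
    Returns the number of archives extracted.
    """
    folder = Path(folder)
    count = 0
    while True:
        archives = sorted(p for p in folder.rglob("*.zip") if p.is_file())
        if not archives:
            return count
        for archive in archives:
            with zipfile.ZipFile(archive) as zf:
                # Like running unzip from the archive's own directory.
                zf.extractall(archive.parent)
            archive.unlink()
            count += 1
```

As with the original, the archives are deleted once extracted, so try it on a copy of the download first if you only have one.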