This week, it finally happened. I think it’s the first time in 20 years that a hard drive has died on me without warning. It was also my first NVMe drive, though that could be a coincidence.

The drive was still under warranty (barely a year and a half old). I even had a spare lying around. But the true cost of restoration is, of course, my own labor. My planning had not been perfect, since I had judged such an event to be unlikely. However, recovery was easy enough. I simply installed NixOS from a bootable USB stick and downloaded my configuration from the backup on my NAS (daily rsync jobs to the rescue). I also downloaded all the important files for my home directory. Then it was simply a matter of adjusting a few things in the configuration file, rebuilding the system, and voilà. Well, except for a few things that didn’t work quite right for some reason and had to be fixed manually, but nothing major.

However, next time I want this to be even easier. It’s probably overkill to install a RAID controller and run multiple drives in RAID1 or RAID5, but the restoration process is still too much manual work. I was thinking of regularly backing up my main drive at the block-device level, so I would just have to swap out the drive and restore the delta from the backup. I’m not quite sure if that’s feasible or a good idea. For my personal system, I have to balance the investment of preparing for a disaster against the likelihood and impact of such an event. This seems like a good trade-off, but I would be curious to hear how other people prepare for drive failure.
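
To make the block-level idea concrete, here is a minimal sketch of what "only copy the delta" could look like. It is not a tested backup tool: the function name `sync_changed_blocks`, the 4 MiB block size, and the idea of keeping the backup as a plain image file are all assumptions made for illustration. It compares the source and the backup block by block and rewrites only the blocks that differ.

```python
import os


def sync_changed_blocks(source_path, backup_path, block_size=4 * 1024 * 1024):
    """Copy only the blocks of source that differ from backup.

    Hypothetical helper: returns the number of blocks that were rewritten.
    """
    # Create an empty backup image on the first run so it can be opened "r+b".
    if not os.path.exists(backup_path):
        open(backup_path, "wb").close()

    written = 0
    with open(source_path, "rb") as src, open(backup_path, "r+b") as dst:
        offset = 0
        while True:
            src_block = src.read(block_size)
            if not src_block:
                break
            # Read the block at the same offset from the backup image.
            dst.seek(offset)
            dst_block = dst.read(len(src_block))
            if src_block != dst_block:
                # Only changed (or missing) blocks are written.
                dst.seek(offset)
                dst.write(src_block)
                written += 1
            offset += len(src_block)
        # Cut off any leftover tail if the source is now smaller.
        dst.truncate(offset)
    return written
```

A sketch like this still has to read the whole source on every run; it only saves writes and transfer. It would also need to run against an unmounted device or a snapshot, because copying a live, mounted filesystem block by block can produce an inconsistent image.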

  • HelloRoot@lemy.lol
    3 days ago

    I have successfully recovered from dead drives by restoring from a borgbackup to a fresh new drive.

    Borg backups take much less space on the backup storage because of extremely efficient compression and deduplication.

    The professor who developed it has some presentations on YouTube, and it’s kind of mind-blowing.

    So that’s what I would recommend.

    I back up all my computers and servers with borgmatic, which makes it a bit easier to manage excluded directories and the number of versions you’d like to keep.

    If you need any help with setting it up, let me know.

    • julian_hoch@lemmy.mlOP
      3 days ago

      Thanks, that looks interesting! I wonder how that compares to something like btrfs snapshots. How easy is it to restore a whole disk as opposed to files and directories?

      • HelloRoot@lemy.lol
        3 days ago

        I have btrfs snapshots with snapper on my desktop. It keeps the last 20 snapshots. Sending them to a second drive would require as much space as the main drive uses, which is ~850 GB of 1 TB.

        But the borg backup of the same data takes only ~450 GB and also keeps the last 20 versions. Because of the smaller size, sending the backup over the network is also quicker than with btrfs.

        So I use btrfs snapshots to roll back unwanted file changes (for example, after a bad system update).

        With Borg, it is easier to set up a central server for all my devices, because the backups take much less space. I run https://github.com/Ravinou/borgwarehouse . So I use that in cases where a drive fails. To restore, I set up the same partition layout as before and then throw the borg backup at it. It has been easy enough so far.