The Backup Post Ate My ISO Machine

— Part 13 of Building a Resilient Home Server Series

Where We Left Off

Part 12 ended on a note of cautious optimism — Windows machine redone, Syncthing certs dropped back in, everything reconnected like nothing happened. The backup paid for itself before the post even went up, which felt like a win. Still on the to-do list though: build a new ISO.

That got punted because the machine dying was what caused the reinstall in the first place. I have a day job. The priority was getting what absolutely had to be running back up and moving on — not fighting with VirtualBox or Hyper-V on a fresh Windows install to build an ISO that wasn't strictly necessary. The previous setup had VirtualBox for the VM, then I'd switched to Hyper-V mid-troubleshooting to rule it out as a factor. Spoiler: wasn't VirtualBox. Machine died anyway. Fresh install, Visual Studio back on it, and the question of what to do about the VM situation.

The answer was not "reinstall Hyper-V." The server has a 6TB SSD with plenty of headroom, and Part 12 already sorted out Samba for Time Machine — the infrastructure is sitting there. If the VM lives on nixos2, the finished ISO lands directly on a network share — no manual copying, no "which machine did I leave that on" archaeology. The workflow is the same as before (spin up VM, port config in, build, push to GitHub — UPDATE: Codeberg, a future post covers why), just less convoluted at the end, and the dev machine doesn't get a hypervisor bolted back onto it for something I can just run on the server.

This turned out to be the right call. Let's get into it.

The Plan

The VM runs the vm branch of nixos-config — same config as the real servers, just with PasswordAuthentication left on for initial setup and none of the server-specific services enabled. Persistent, not ephemeral, because builds are occasional and I want to be able to SSH back in later without starting from scratch.
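
For reference, the password-auth part is a single OpenSSH setting. A minimal sketch of what the vm branch toggles, assuming the rest of the SSH config matches the servers:

  # vm branch only: allow password login for the very first SSH connection,
  # before keys have been copied in. The real servers keep this off.
  services.openssh = {
    enable = true;
    settings.PasswordAuthentication = true;
  };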

The piece that makes the ISO distribution story clean is virtiofs. The host exposes a directory to the VM as a filesystem — no network involved, no daemon syncing in the background, no delay. The VM writes to /mnt/host-isos/ and the file shows up immediately on nixos2 at /mnt/nextcloud-data/isos/, which is already on the Samba share at \\nixos2\isos. ISO finishes building, it's on the network. That's it.

nixos2-config: modules/vm.nix

Everything VM-related lives in a new vm.nix module. Wiring it in is one line in configuration.nix:

  imports = [
    /etc/nixos/hardware-configuration.nix
    /etc/nixos/modules/networking.nix
    /etc/nixos/modules/services.nix
    /etc/nixos/modules/monitoring.nix
    /etc/nixos/modules/system.nix
    /etc/nixos/modules/boot-uefi.nix
    /etc/nixos/modules/timemachine.nix
    /etc/nixos/modules/vm.nix   # <- this
    <home-manager/nixos>
  ];

The module covers libvirtd/QEMU config, a virtiofsd systemd service that exposes the isos directory to the VM, storage directories on the SSD, and two systemd oneshots for setup that would otherwise require manual steps after every fresh enable.
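
For orientation, the non-oneshot parts of the module look roughly like this. It's a minimal sketch that leaves out the virtiofsd unit and the two oneshots, and the directory ownership and mode are placeholders rather than the exact values in the repo:

  # Hypervisor: libvirtd driving QEMU/KVM.
  virtualisation.libvirtd.enable = true;

  # Make sure the shared ISO directory exists on the SSD.
  # Ownership/mode here are illustrative, not the repo's exact values.
  systemd.tmpfiles.rules = [
    "d /mnt/nextcloud-data/isos 0775 ppb1701 users -"
  ];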

One of those oneshots is worth explaining. libvirt's default NAT network isn't automatically defined on NixOS — on a stock setup you'd have to run virsh net-define by hand after enabling the VM host. I added a systemd oneshot that checks whether the network already exists before touching anything, so a rebuild doesn't try to create it twice and error out. Same pattern for the storage pool. Nothing to remember to run manually after a rebuild.
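
A sketch of what that oneshot can look like. The service name is hypothetical, the network XML is libvirt's stock default NAT definition, and the real module may differ in the details:

  systemd.services.define-libvirt-default-net =
    let
      defaultNet = pkgs.writeText "libvirt-default-net.xml" ''
        <network>
          <name>default</name>
          <forward mode="nat"/>
          <bridge name="virbr0" stp="on" delay="0"/>
          <ip address="192.168.122.1" netmask="255.255.255.0">
            <dhcp>
              <range start="192.168.122.2" end="192.168.122.254"/>
            </dhcp>
          </ip>
        </network>
      '';
    in {
      description = "Define libvirt's default NAT network if it is missing";
      wantedBy = [ "multi-user.target" ];
      after = [ "libvirtd.service" ];
      requires = [ "libvirtd.service" ];
      serviceConfig.Type = "oneshot";
      serviceConfig.RemainAfterExit = true;
      path = [ pkgs.libvirt ];
      script = ''
        # Idempotent: only define/start the network if libvirt doesn't already
        # know about it, so a rebuild never tries to create it twice.
        if ! virsh net-info default > /dev/null 2>&1; then
          virsh net-define ${defaultNet}
          virsh net-autostart default
          virsh net-start default
        fi
      '';
    };

The storage pool oneshot follows the same check-before-create pattern with virsh pool-info / pool-define.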

Samba: Adding the isos Share

Adding the ISO share also prompted a tidy-up of how Samba is structured across the config files. With more than one share now, it made sense to split things properly — services.nix owns the base Samba config (the global settings plus the isos share), and timemachine.nix holds only what's specific to Time Machine (the share definition, tmuser, and the directory). NixOS merges the settings across modules, so both files contribute to the same running Samba instance.

services.nix now has the global config and isos share:

  services.samba = {
    enable = true;
    settings = {
      global = {
        "workgroup" = "WORKGROUP";
        "server string" = "nixos2";
        "security" = "user";
        "server role" = "standalone server";
        "fruit:metadata" = "stream";
        "fruit:model" = "MacSamba";
        "fruit:posix_rename" = "yes";
        "fruit:veto_appledouble" = "no";
        "fruit:wipe_intentionally_left_blank_rfork" = "yes";
        "fruit:delete_empty_adfiles" = "yes";
      };
      isos = {
        "path" = "/mnt/nextcloud-data/isos";
        "comment" = "NixOS ISO builds";
        "browseable" = "yes";
        "writable" = "yes";
        "valid users" = "ppb1701";
        "create mask" = "0644";
        "directory mask" = "0755";
        "vfs objects" = "catia";
      };
    };
  };
services.samba-wsdd.enable = true;

And timemachine.nix just adds its share:

  services.samba = {
    enable = true;
    settings = {
      timemachine = {
        path = "/mnt/nextcloud-data/timemachine";
        browseable = "yes";
        writable = "yes";
        "valid users" = "tmuser";
        "vfs objects" = "catia fruit streams_xattr";
        "fruit:time machine" = "yes";
        "fruit:time machine max size" = "2000G";
      };
    };
  };

Verified from Windows: \\nixos2\isos shows up in Explorer, ISO visible. That was the last step and it felt disproportionately satisfying.

Monitoring: SSD Alert Rules

The SSD is doing more work now — Time Machine, ISOs, Nextcloud data. Added two Prometheus alert rules in modules/monitoring.nix to catch space issues before they become a problem:

- alert: SSDSpaceWarning
  expr: (node_filesystem_avail_bytes{mountpoint="/mnt/nextcloud-data"} / node_filesystem_size_bytes{mountpoint="/mnt/nextcloud-data"}) * 100 < 20
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: "Low space on SSD (nextcloud-data)"
    description: "SSD at /mnt/nextcloud-data has less than 20% free."

- alert: SSDSpaceCritical
  expr: (node_filesystem_avail_bytes{mountpoint="/mnt/nextcloud-data"} / node_filesystem_size_bytes{mountpoint="/mnt/nextcloud-data"}) * 100 < 10
  for: 2m
  labels:
    severity: critical
  annotations:
    summary: "Critical space on SSD (nextcloud-data)"
    description: "SSD at /mnt/nextcloud-data has less than 10% free."

Neither fired. This is the correct outcome.

nixos-config: vm Branch Changes

iso-config.nix

One line — home-manager switched from release-25.05 to master to match the unstable nixpkgs the VM is running. Mismatched channels were the whole cascade from Part 12. Not doing that again.
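
Lesson 6 at the end of this post names the mechanism: home-manager comes in through nix.nixPath, so the one-line change is which tarball that entry points at. A sketch of the shape, with the other entries shown as typical defaults rather than the repo's exact contents:

  nix.nixPath = [
    "nixpkgs=channel:nixos-unstable"
    "nixos-config=/etc/nixos/configuration.nix"
    # was: .../home-manager/archive/release-25.05.tar.gz
    "home-manager=https://github.com/nix-community/home-manager/archive/master.tar.gz"
  ];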

modules/system.nix: The virtiofs Mount

This is the piece that didn't exist in the old VirtualBox workflow. Instead of building the ISO and then figuring out where to put it, the VM mounts a directory directly from the host:

  fileSystems."/mnt/host-isos" = {
    device = "host-isos";
    fsType = "virtiofs";
    options = [ "nofail" ];
  };

  boot.kernelModules = [ "virtiofs" ];

"host-isos" in device matches the <target dir="host-isos"/> in the VM XML. That string is what ties the guest-side mount to the host-side virtiofsd socket. If they don't match, the mount fails silently and you spend time wondering why the directory is always empty.

The nofail is not optional. I found this out the hard way. Without it, if the VM boots before virtiofsd has its socket ready — which happens on a cold nixos2 boot — the VM drops to emergency.target. With nofail, the mount retries when it can and the boot continues normally. One word. Add it.

build-iso.sh

Updated to auto-detect whether it's running inside the VM by checking if /mnt/host-isos is actually mounted. If it is, the ISO goes there directly. If not (running on a physical machine), it falls back to printing the manual copy instructions:

if mountpoint -q /mnt/host-isos 2>/dev/null; then
  DEST="/mnt/host-isos/nixos-config-$(date +%Y%m%d-%H%M).iso"
  cp "$ISO_PATH" "$DEST"
  # Escaped so the printed UNC path comes out as \\nixos2\isos\<file>
  echo "Available via Samba: \\\\nixos2\\isos\\$(basename "$DEST")"
else
  echo "Copy to Ventoy USB with:"
  echo "sudo cp $ISO_PATH /path/to/ventoy/"
fi

The timestamp in the filename means old builds don't get clobbered. Handy when you want the last known-good ISO without having to check whether the one on the share is this build or the previous one.
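
If you just want the newest build off the share without squinting at timestamps, something like this does it (path shown as it's mounted on nixos2):

ls -t /mnt/nextcloud-data/isos/*.iso | head -n 1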

The VM XML: Two Required Additions

After creating the VM in virt-manager, virtiofs needs two additions to the domain XML. Edit it with virsh dumpxml iso-builder > /tmp/iso-builder.xml, make the changes, then virsh define /tmp/iso-builder.xml.
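
Spelled out as commands, with the caveat that the redefined XML only takes effect the next time the VM starts:

virsh dumpxml iso-builder > /tmp/iso-builder.xml
# edit /tmp/iso-builder.xml: add <memoryBacking> and <filesystem> as shown below
virsh define /tmp/iso-builder.xml
virsh shutdown iso-builder   # if it's running; the new config applies on next start
virsh start iso-builder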

Add <memoryBacking> before <devices>:

<memoryBacking>
  <source type='memfd'/>
  <access mode='shared'/>
</memoryBacking>

virtiofs requires shared memory access. Without this block, starting the VM fails with "error: Unable to find a satisfying virtiofsd", a message that does not helpfully point you at memoryBacking as the cause. You're welcome.

And add <filesystem> inside <devices>:

<filesystem type='mount' accessmode='passthrough'>
  <driver type='virtiofs'/>
  <source dir='/mnt/nextcloud-data/isos'/>
  <target dir='host-isos'/>
</filesystem>

target dir is the string the guest uses as the device name in fileSystems. Both sides need the same string. host-isos here, host-isos there.

Things That Went Wrong

Fun with unstable: services.samba.extraConfig was removed. The global Samba settings that used to live in a freeform string block now need to be in services.samba.settings.global as proper key-value pairs. Same deal with securityType = "user" as a top-level option — that's gone too; security = "user" now lives inside the global settings block. Neither change has any effect on how Samba actually behaves; it's just a config format migration. The error message at least tells you exactly what to fix.
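
For anyone hitting the same eval error, the migration looks roughly like this. The old block is illustrative, not my exact previous config:

  # Old style (removed on current unstable):
  # services.samba.securityType = "user";
  # services.samba.extraConfig = ''
  #   workgroup = WORKGROUP
  #   server string = nixos2
  # '';

  # New style:
  services.samba.settings.global = {
    "security" = "user";
    "workgroup" = "WORKGROUP";
    "server string" = "nixos2";
  };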

One that only showed up after the fact: configuration-uefi.nix was importing boot-bios.nix instead of boot-uefi.nix. Copy/paste error at some point — the imports list got updated but grabbed the wrong boot module. It was never flagged before because the old dev machine setup needed BIOS anyway, so it never caused a problem. Caught it when a rebuild inside the VM failed trying to install GRUB to /dev/sda, which doesn't exist. A few characters' difference in a filename, easy to miss. Unrelated but caught in the same pass: the explicit OVMF option was removed on unstable — OVMF is always available now — so delete it from vm.nix and move on.

The VM booted in BIOS mode instead of UEFI on the first try — default virt-manager firmware selection. The real server config is UEFI, so a BIOS VM would produce an ISO that boots wrong on the target hardware. Delete the VM, recreate with UEFI/OVMF explicitly selected. Easy to miss during initial setup.

AdGuard conflicting with libvirt's dnsmasq was a fun one. AdGuard was bound to 0.0.0.0:53, which means it grabs every interface on the machine — including virbr0, the virtual bridge libvirt creates for its NAT network. libvirt's own dnsmasq instance needs port 53 on that interface for VM DHCP and DNS. They can't share it. Result: VMs get no DNS and you spend a while wondering why. The fix is to restrict AdGuard to specific interfaces:

bind_hosts = [ "YOUR_LAN_IP" "127.0.0.1" "YOUR_TAILSCALE_IP" ];

This is easy to miss because AdGuard works fine on a machine that isn't running VMs. The conflict only shows up when libvirt tries to start its network. Added a comment in services.nix flagging this so it's not a mystery on the next rebuild.
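
In NixOS terms, that line sits under AdGuard Home's DNS settings. Roughly, with the placeholder IPs standing in for real addresses:

  services.adguardhome.settings.dns = {
    # Bind only to the LAN, loopback, and Tailscale addresses, leaving
    # port 53 on virbr0 free for libvirt's dnsmasq.
    bind_hosts = [ "YOUR_LAN_IP" "127.0.0.1" "YOUR_TAILSCALE_IP" ];
  };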

The VM dropped to emergency.target after the first rebuild inside the VM. That was the missing nofail on the virtiofs mount, already covered above — listed here too because it ate close to an hour and deserves to be findable.

The VM couldn't reach git.home or anything on 192.168.50.x. The VM is on libvirt's NAT subnet (192.168.122.x), not bridged to the LAN. nixos2 is on WiFi anyway so bridged networking isn't really an option — outbound internet through NAT works fine, but internal LAN hostnames and private IPs aren't directly reachable from inside the VM. The workaround is Tailscale: 100.x.x.x addresses route through the mesh regardless of which physical network you're on. Cloned nixos-config over Tailscale, checked out the vm branch, rebuilt. Worked fine.
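
Concretely, that just meant cloning from the git host's Tailscale address instead of its LAN name. The address and repo path below are placeholders:

# 100.x.y.z stands in for the git server's Tailscale address
git clone git@100.x.y.z:ppb1701/nixos-config.git
cd nixos-config
git checkout vm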

How It Went

nixos2 rebuilt with vm.nix imported. libvirtd came up, virtiofsd came up, the isos Samba share came up. Created the VM in virt-manager with UEFI firmware, installed NixOS via console, VM booted. SSHed in on port 2212 — worked first try, which I will take. Cloned nixos-config over Tailscale, checked out the vm branch, rebuilt. Hit the virtiofs emergency.target issue, added nofail, rebuilt again, clean boot. Updated the VM XML with <memoryBacking> and <filesystem>. Verified: mount | grep host-isos shows host-isos on /mnt/host-isos type virtiofs (rw,relatime). Wrote a test file to /mnt/host-isos/, checked nixos2 — visible at /mnt/nextcloud-data/isos/ immediately. Ran build-iso.sh. ISO landed on \\nixos2\isos. Verified from Windows.

That's the whole thing, more or less. It worked.

What's Still Left

Not much, as it turns out. The disabled stubs are already ported to nixos (primary), VM-SETUP.md is updated, everything is committed to both repos, and there's a fresh ISO sitting on GitHub (UPDATE: Codeberg).

Lessons Learned

  1. When you have a server with spare capacity, use it. The dev machine doesn't need a hypervisor on it. The server is there, it has room, and running the VM on the server makes the whole ISO distribution story significantly less annoying.
  2. virtiofs is the right tool for large, one-directional files. The ISO is on the network share the moment the copy finishes — no sync window, no manual transfer, no wondering whether it got there. For files that only move in one direction and are large enough that you don't want a sync daemon involved, sharing a directory via virtiofs is exactly right.
  3. nofail on virtiofs mounts. Always. Without it, a boot where virtiofsd isn't ready yet sends the VM to emergency.target. One word in the options list.
  4. AdGuard bound to 0.0.0.0:53 will fight libvirt's dnsmasq. Restrict AdGuard to specific interfaces before enabling VM support, or you'll be debugging VM DNS failures that are actually a port conflict on virbr0.
  5. Tailscale makes NAT networking much less of a problem. When the VM can't reach internal LAN services because it's on a NAT subnet, Tailscale fills the gap without any extra configuration on the VM side.
  6. nix.nixPath with home-manager pinned to master removes the manual channel step on fresh installs. Small thing, but it removes one more "oh right, I forgot to add the channel" moment from the VM setup process.

What's Next

Well, I didn't know what was next when I wrote that. VM was up, ISO was building, repos were clean — felt like a good stopping point. Then it hit me. Working headless on a server is fine until you actually want to see what's happening inside a VM. Part 14 is about fixing that — and fixing an unexpected bug the ISO build revealed along the way.

Find me at @ppb1701@ppb.social on Mastodon if you're following along, or if the virtiofs emergency.target rabbit hole cost you an afternoon too.


Main Server (nixos): Codeberg
Second Server (nixos2): Codeberg
The ISO can be downloaded here.