Upgrading a Busy Debian 12 Server to Debian 13
I had been postponing the upgrade for a while.
The server was not a disposable VM, not a fresh install, and definitely not one of those “just nginx and a static site” boxes that make upgrade guides look easy. It was a real VPS, the kind that slowly becomes a small city: CloudPanel, nginx, Percona/MySQL, Docker, Portainer, PostgreSQL, MongoDB, Node.js, Tailscale, Quassel, Certbot, a pile of external APT repositories, abandoned virtual hosts, old certificates, and a few services I had almost forgotten existed.
In other words, exactly the kind of server where a Debian major upgrade can go from routine maintenance to archaeology with consequences.
I started with the obvious safety net: a fresh VPS snapshot and backups. That was non-negotiable. I also made sure I had VNC console access, rescue mode, and SSH. If you are doing a remote distribution upgrade without at least one out-of-band recovery path, you are not being brave; you are being reckless.
The target was Debian 13 “Trixie”. The source system was Debian 12 “Bookworm”. The plan was simple enough: clean the current system, disable third-party repositories, switch the Debian sources, upgrade in stages, fix whatever broke, reboot, then reintroduce external repositories one by one.
The actual process was, naturally, less elegant.
Cleaning Before the Upgrade
Before touching the Debian release files, I checked the system state. There were broken certificate renewals, stale nginx configurations, and a few domains that no longer existed. Two old hosts, chat.pablo.space and fit.portalidea.dev, were removed from the active nginx/certificate setup. Another dead site config, lists.sk.1208.pro.conf, had to go as well.
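For anyone doing the same cleanup, the pattern was roughly this; the paths assume the usual /etc/nginx/sites-enabled layout and the names are the dead hosts above, so adjust for your own setup (on CloudPanel you can also delete sites from the panel itself):
# see what Certbot still thinks it manages
certbot certificates
# drop the certificate for a host that no longer exists
certbot delete --cert-name chat.pablo.space
# remove the dead vhost and reload nginx
rm /etc/nginx/sites-enabled/lists.sk.1208.pro.conf
nginx -t && systemctl reload nginx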
The certificate setup was messier than expected. One domain, chat.pablomurad.com, was not a website at all. It was used by my Quassel Core. Certbot still had it configured as if it should renew through a Cloudflare DNS challenge, even though the domain was not managed that way anymore. That was a leftover from some earlier configuration.
I reissued the certificate using nginx validation instead. Certbot updated the renewal configuration correctly, and the dry run started passing again. Since Quassel uses its own certificate file under /var/lib/quassel/quasselCert.pem, I confirmed that it had the new certificate and that Quassel was still running with SSL required on port 4242.
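The reissue itself was short, roughly this, with Certbot's nginx plugin doing the validation and rewriting the renewal config:
certbot certonly --nginx -d chat.pablomurad.com
certbot renew --dry-run
Quassel reads its own PEM rather than the live Certbot directory, so the remaining step was refreshing /var/lib/quassel/quasselCert.pem from the new key and certificate; how you concatenate those depends on your setup.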
That was the first reminder of the day: certificates are never just certificates. They are usually connected to some service you forgot was special.
Repositories: The Real Risk
The server had several external repositories: Docker, Tailscale, NodeSource, PostgreSQL PGDG, MongoDB, GitHub CLI, CubeCoders, Percona-related packages, and CloudPanel.
For a major Debian upgrade, leaving all of those enabled is asking for dependency chaos. I disabled the external repositories and kept the CloudPanel repository, switching it from Bookworm to Trixie once I confirmed CloudPanel had Debian 13 support.
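Disabling them did not mean deleting anything. I just moved the source files out of the way so they could come back later, roughly like this (the filenames are examples; list yours with ls /etc/apt/sources.list.d first):
cd /etc/apt/sources.list.d
mkdir -p disabled
mv docker.list tailscale.list nodesource.list pgdg.list mongodb-org-7.0.list disabled/
apt update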
Then I replaced the Debian sources with Trixie:
deb http://deb.debian.org/debian trixie main contrib non-free non-free-firmware
deb http://deb.debian.org/debian trixie-updates main contrib non-free non-free-firmware
deb http://deb.debian.org/debian-security trixie-security main contrib non-free non-free-firmware
After apt update came back clean, I moved forward.
The Two-Stage Upgrade
I did not jump straight into apt full-upgrade. First came the safer stage:
apt upgrade --without-new-pkgs
This pulled in the essential base changes first: apt, libc, and other core packages. It also triggered service restarts through needrestart. PostgreSQL, nginx, MySQL, PHP-FPM, and other services were restarted along the way.
That is where nginx broke.
The problem was an old PageSpeed module:
/usr/share/nginx/modules/ngx_pagespeed.so
The upgraded nginx refused to load it; dynamic modules are built against a specific nginx version, and this one predated the upgrade. Once I disabled the module, nginx still failed, because old pagespeed directives remained in nginx.conf and in a site config. Removing the module was not enough; I had to comment out the remaining directives too.
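Tracking down those leftovers was mostly grep:
grep -Rni pagespeed /etc/nginx/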
After that, nginx -t passed and nginx came back.
That was the second reminder: when a binary module breaks, its configuration often keeps breaking things after the module itself is gone.
There was also a logrotate failure. It turned out to be a permissions issue on a Mailman-related logrotate file:
/etc/logrotate.d/mailman3-web-cron
The file was group-writable, and logrotate refused to process it. Changing it back to 0644 fixed the issue.
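For reference, the fix plus the check that logrotate was happy again:
chmod 0644 /etc/logrotate.d/mailman3-web-cron
logrotate --debug /etc/logrotate.conf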
Once nginx and logrotate were clean, I moved on to the full upgrade:
apt full-upgrade
The full upgrade mostly completed, but dpkg reported a failure related to dovecot-core. Dovecot had moved to a newer configuration format and now expected dovecot_config_version at the top of its config. I did not actually need Dovecot for this server’s current mail flow, so I postponed that decision until after the system was stable.
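For the record, the 2.4 format wants a version marker at the very top of /etc/dovecot/dovecot.conf, something like the two lines below. But the new config format changed in more ways than that, which is exactly why I postponed it:
dovecot_config_version = 2.4.0
dovecot_storage_version = 2.4.0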
An apt -f install finished the interrupted configuration. Then another apt full-upgrade cleaned up the remaining package state.
At that point, Debian reported:
13.4
The new Debian 13 kernel was installed, but the system was still running the old Debian 12 kernel until reboot.
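A quick way to see both facts at once:
cat /etc/debian_version     # 13.4
uname -r                    # still the Bookworm kernel until reboot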
The Reboot
After the reboot, the machine came back on the Debian 13 kernel:
6.12.85+deb13-amd64
That was the moment I actually believed the upgrade had worked.
The important services were up:
- nginx
- CloudPanel’s nginx
- CloudPanel PHP-FPM
- MySQL/Percona
- Docker
- Quassel Core
There were no failed systemd units after the reboot, CloudPanel was now running from its Trixie package, and Certbot had already been cleaned up earlier. The server was alive.
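The check itself was nothing fancy; the service names here match my setup:
systemctl --failed
systemctl status nginx mysql docker quasselcore --no-pager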
Dovecot was still not worth saving, so I removed it. Postfix stayed, because SMTP delivery is a different thing from IMAP/POP mailbox access, and I did not want to remove anything that applications might still use for outbound mail.
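Removing it was the usual purge, with Postfix left untouched. It is worth checking which dovecot-* packages are actually installed first:
dpkg -l 'dovecot*'
apt purge dovecot-core
apt autoremove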
Reintroducing External Repositories
Once the base system was stable, I reintroduced the external repositories one by one.
Docker came first, using the Debian 13 repository. Then Tailscale, NodeSource, GitHub CLI, PostgreSQL PGDG, and CubeCoders/AMP. I did not blindly re-enable the old Percona repository, because CloudPanel was already managing the MySQL/Percona side of the system. Mixing an old Percona Bookworm repo into a Trixie system would have been a stupid way to break a database.
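Re-adding a repository meant pointing it at Trixie rather than restoring the old file verbatim. Docker, for example, ends up with a sources line roughly like this (key path per Docker's Debian instructions; adjust if yours differs), followed by a plain apt update:
deb [arch=amd64 signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/debian trixie stable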
MongoDB was treated cautiously. If an upstream repository does not clearly support the new Debian release yet, I do not pretend it does just because I want the upgrade checklist to look complete.
PostgreSQL needed a small follow-up upgrade from PGDG Debian 12 builds to PGDG Debian 13 builds. That completed cleanly.
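That one is just the PGDG suite name following the release: the sources line moves from bookworm-pgdg to trixie-pgdg (keeping whatever signed-by= keyring option you already had), and a normal upgrade pulls the Debian 13 builds:
deb http://apt.postgresql.org/pub/repos/apt trixie-pgdg main
apt update && apt upgrade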
Tailscale, DNS, and a Docker Surprise
Tailscale had one annoying leftover: DNS.
Even after setting:
tailscale set --accept-dns=false --accept-routes=false
the system still had /etc/resolv.conf generated by Tailscale and pointing to:
nameserver 100.100.100.100
That eventually broke a user’s Docker build. The build failed while trying to pull python:3.11-slim, with Docker unable to resolve auth.docker.io through Tailscale DNS.
The fix was to replace /etc/resolv.conf with normal public resolvers and ensure Tailscale was no longer managing DNS. After that, name resolution worked again.
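Concretely, and assuming /etc/resolv.conf is a plain file on your system rather than a symlink managed by something else:
tailscale set --accept-dns=false
printf 'nameserver 1.1.1.1\nnameserver 8.8.8.8\n' > /etc/resolv.conf
The resolver choice is just what I picked; anything sane works.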
This was a good example of a bug that looked like Docker, looked like Python, looked like Docker Hub, but was really just DNS.
Docker Got Messy
Restarting Docker on a server with many containers is never as harmless as it sounds.
After the DNS fix, Docker spent a long time reconciling old containers, stale sandboxes, containerd shims, and networks. There were messages like:
sandbox not found
task already exists
failed to create task for container
Portainer appeared to be down for a while because the Docker daemon itself was stuck halfway between stopping and starting. The right move was not to keep hammering docker ps or restarting Docker repeatedly. The right move was to inspect systemd, containerd, and the Docker logs, then perform a clean daemon restart if needed.
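In practice, "inspect, then restart cleanly" looked roughly like this:
systemctl status docker containerd --no-pager
journalctl -u docker -u containerd --since '30 min ago'
# only once the logs made sense:
systemctl restart docker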
Docker eventually stabilized, but it was a reminder that Docker is not one thing. It is dockerd, containerd, shims, proxies, networks, volumes, restart policies, and whatever state was left behind by the previous process.
Portainer
Portainer itself was straightforward once Docker was healthy.
It was already using the correct persistent volume:
portainer_data:/data
So updating it meant pulling the current LTS image, removing the old container, and recreating it with the same volume and Docker socket:
docker pull portainer/portainer-ce:lts
docker stop portainer
docker rm portainer
docker run -d \
  --name portainer \
  --restart=always \
  -p 9443:9443 \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v portainer_data:/data \
  portainer/portainer-ce:lts
No drama there. The important part was not losing the volume.
Hardening SSH After the Upgrade
Once the system was stable, I also cleaned up SSH access.
I generated a dedicated Ed25519 key for the server instead of overwriting my existing default SSH key. Then I imported that private key into Termius and confirmed root login worked with the key.
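The key itself was standard; the file name is just what I chose, and ssh-copy-id still worked because password auth had not been locked down yet:
ssh-keygen -t ed25519 -f ~/.ssh/id_ed25519_vps
ssh-copy-id -i ~/.ssh/id_ed25519_vps.pub root@<server>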
I wanted root to require a key, but I did not want to disable password login for all users. That distinction matters. Setting PasswordAuthentication no globally would block password authentication for everyone.
The final SSH configuration was:
PubkeyAuthentication yes
PasswordAuthentication yes
KbdInteractiveAuthentication yes
PermitRootLogin prohibit-password
That gives the desired behavior:
- root can log in with a key
- root cannot log in with a password
- normal users can still use passwords
I tested root password login explicitly, and it failed as expected.
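In concrete terms: reload sshd, then force password authentication from a client and watch it get refused.
sshd -t && systemctl reload ssh
ssh -o PubkeyAuthentication=no -o PreferredAuthentications=password root@<server>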
I also installed Fail2Ban. It immediately started banning IPs trying to brute-force SSH. That was not surprising. Any public VPS gets scanned constantly. The difference is that now root password login is dead, and repeated attackers get banned.
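Fail2Ban needed nothing beyond the stock Debian package, which ships an sshd jail out of the box. Installing it and checking the jail was enough:
apt install fail2ban
fail2ban-client status sshd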
What Broke
The upgrade itself was successful, but several things needed attention:
- stale nginx hosts
- broken Certbot renewal configuration
- Quassel certificate handling
- old PageSpeed nginx module
- leftover PageSpeed directives
- logrotate permissions
- Dovecot 2.4 config incompatibility
- Tailscale DNS residue
- Docker/containerd stale runtime state
- external repositories needing careful reintroduction
- SSH security cleanup
None of these were exotic. They were exactly the kind of boring, accumulated server issues that only show up during a major upgrade.
What Worked
The staged approach worked.
The snapshot mattered. The VNC/rescue access mattered. Disabling third-party repositories mattered. Doing apt upgrade --without-new-pkgs before apt full-upgrade mattered. Testing services before reboot mattered. Reintroducing repositories one at a time mattered.
Most importantly, I did not treat the server like a lab machine. This was a production-ish VPS full of unrelated services, and the upgrade had to respect that.
Final State
The server ended up on Debian 13.4, running the Debian 13 kernel, with the main services alive:
- CloudPanel
- nginx
- MySQL/Percona
- Docker
- Portainer
- PostgreSQL
- Node.js
- GitHub CLI
- Tailscale
- Quassel
- Certbot
- Fail2Ban
Dovecot was removed because it was not needed. PageSpeed was disabled because the old nginx module was not worth saving. Tailscale DNS was disabled because the VPS should not depend on MagicDNS for normal public resolution. Docker was brought back after some runtime cleanup pain.
It was not a clean upgrade in the blog-post sense. It was a real one.
And honestly, that is the useful lesson: the commands are only half the upgrade. The other half is knowing which breakages matter, which ones are old garbage finally surfacing, and which ones should be left alone until the machine is stable again.