Not exactly self hosting but maintaining/backing it up is hard for me. So many “what if”s are coming to my mind. Like what if DB gets corrupted? What if the device breaks? If on cloud provider, what if they decide to remove the server?
I need a local server and a remote one that are synced to confidentially self-host things and setting this up is a hassle I don’t want to take.
So my question is how safe is your setup? Are you still enthusiastic with it?
My incredible hatred and rage for not understanding things powers me on the cycle of trying and failing hundreds of times till I figure it out. Then I screw it all up somehow and the cycle begins again.
All of your issues can be solved by a backup. My host went out of business. I set up a new server, pulled my backups, and was up and running in less than an hour.
I’d recommend docker compose. Each service gets its own folder inside your docker folder. All volumes are a folder in the services folder. Each night, run a script that stops all of them, starts duplicati, backs up to a remote server or webdav share or whatever, and then starts them back up again. If you want to be extra safe, back up to two locations. It’s not that complicated if it’s just your own services.
I have a rack in my garage.
My advice, keep it simple, keep it virtual.
I dumpster dove for hardware and run proxmox on hosts. Not even clustered, just simple stand alone proxmox hosts. Connect to my Synology storage device and done.
I run next cloud for webDav contacts and calendar (fuck Google), it does photo and do. Storage. The next client is free from F-Droid for Android and works on debian desktops like a charm.
I run Minecraft server
I run home automation server
I run a media server.
Proxmox backs everything up on schedule
All I need to do is get off-site backup setup for Synology important data and I’m all set.
It’s really not as hard as you think if you keep it simple
Others have said this, but it’s always a work in progress.
What started out as just a spare optiplex desktop and needing a dedicated box for Minecraft and valheim servers, to now having a rack in my living room with a few key things I and others rely on. You definitely aren’t alone XD
Regular, proactive work goes a long way. I also stated creating tickets for myself, each with a specific task. This way I could break things down, have reminders of what still needs attention, and track progress.
Do you host your ticketing system? I’d like to try one out. My TODO markings in my notes app don’t end up organized enough to be helpful. My experience is with JIRA, which I despise with every fiber of my being.
I have set up forgejo, which is a fork of gitea. It’s a git forge, but its ticketing system is quite good.
Oh neat, I was actually planning to set that up to store scripts and some projects I’m working on, I’ll give the tickets a try then.
Try Vikunja, it might tick the box for you.
We built Vikunja with speed in mind - every interaction takes less than 100ms.
Their heads are certainly in the right place. I’ll check this out, thank you!
Mostly I just use nextclouds deck extension. It behaves close enough to what I need as a solo operation.
Right now I just play with things at a level that I don’t care if they pop out of existence tomorrow.
If you want to be truly safe (at an individual level, not an institutional level where there’s someone with an interest in fucking your stuff up), you need to make sure things are recoverable unless 3 completely separate things go wrong at the same time (an outage at a remote data centre, your server fails and your local backup fails). Very unlikely for all 3 to happen simultaneously, but 1 is likely to fail and 2 is forseeable, so you can fix it before the 3rd also fails.
Exactly right there with the not worrying. Getting started can be brutal. I always recommend people start without worrying about it, be okay with the idea that you’re going to lose everything.
When you start really understanding how the tech works, then start playing with backups and how to recover. By that time you’ve probably set up enough that you are ready for a solution that doesn’t require setting everything up again. When you’re starting though? Getting it up and running is enough
Gonna just stream of consciousness some stuff here:
Been thinking lately, especially as I have been self-hosting more, how much work is just managing data on disk.
Which disk? Where does it live? How does the data transit from here to there? Why isn’t the data moving properly?
I am not sure what this means, but it makes me feel like we are missing some important ideas around data management at personal scale.
I just got a mini PC and put 5 disks in it and struggling with the same…
Don’t over think it, start small, a home server. Then add stuff, you will see that it’s not that crazy.
I personally have just one home server that locally creates encrypted backups and uploads them to backblaze.
This gives me the privacy I need as everything is on my server that I own while also having the backups on a big reliable company.
It’s not perfect but it fits my threat model
Not exactly self hosting but maintaining/backing it up is hard for me. So many “what if”s are coming to my mind. Like what if DB gets corrupted? What if the device breaks? If on cloud provider, what if they decide to remove the server?
Backups. If you follow the 3-2-1 backup strategy, you don’t have to worry about anything.
Isn’t it enough to have a single offsite backup?
Yeah, that’s exactly what the 3-2-1 rule says.
- 3 copies of your data
- 2 different kinds of storage media
- 1 off-site backup
I started as more “homelab” than “selfhosted” as first - so I was just stuffing around playing with things, but then that seemed sort of pointless and I wanted to run real workloads, then I discovered that was super useful and I loved extracting myself from commercial cloud services (dropbox etc). The point of this story is that I sort of built most of the infrastructure before I was running services that I (or family) depended on - which is where it can become a source of stress rather than fun, which is what I’m guessing you’re finding yourself in.
There’s no real way around this (the pressure you’re feeling), if you are running real services it is going to take some sysadmin work to get to the point where you feel relaxed that you can quickly deal with any problems. There’s lots of good advice elsewhere in this thread about bit and pieces to do this - the exact methods are going to vary according to your needs. Here’s mine (which is not perfect!).
- I’m running on a single mini PC & a Synology NAS setup for RAID 5
- I’ve got a nearly identical spare mini PC, and swap over to it for a couple of weeks (originally every month, but stretched out when I’m busy). That tests my ability to recover from that hardware failure.
- All my local workloads are in LXC containers or VM’s on Proxmox with automated snapshots that are my (bulky) backups, but allow for restoration in minutes if needed.
- The NAS is backed up locally to an external USB that’s not usually plugged in, and to a lower speced similar setup 300km away.
- All the workloads are dockerised, and I have a standard directory structure and compose approach so if I need to upgrade something or do some other maintenance of something I don’t often touch, I know where everything is with out looking back to the playbook
- I don’t use a script or Terrafrom to set those up, I’ve got a proxmox template with docker and tailscale etc installed that I use, so the only bit of unique infrastructure is the docker compose file which is source controlled on Forgejo
- Everything’s on UPSs
- A have a bunch of ansible playbooks for routine maintenance such as apt updates, also in source control
- all the VPS workloads are dockerised with the same directory structure, and behind NGINX PM. I’ve gotten super comfortable with one VPS provider, so that’s a weakness. I should try moving them one day. They are mostly static websites, plus one important web app that I have a tested backup strategy for, but not an automated one, so that needs addressed.
- I use a local and an external UptimeKuma for monitoring, enhanced by running a tiny server on every instance that just exposes a disk free and memory free api that can be consumed by Uptime.
I still have lots of single points of failure - Tailscale, my internet provider, my domain provider etc, but I think I’ve addressed the most common which would be hardware failures at home. My monitoring is also probably sub-par, I’m not really looking at logs unless I’m investigating a problem. Maybe there’s a Netdata or something in my future.
You’ve mentioned that a syncing to a remote server for backups is a step you don’t want to take, if you mean managing your own is a step you don’t want to take, then your solutions are a paid backup service like backblaze or, physically shuffling external USB drives (or extra NASs) back and forth to somewhere - depending on what downtime you can tolerate.
TrueNAS scale helps a lot, as it makes many popular apps just a few clicks away. Or for more power-users, stuff like the linux cockpit also really helps.
To directly answer your questions…
- In the event of DB corruption (which hasn’t happened to me yet) I would probably rollback that app to the previous snapshot. I suspect that TrueNAS having ZFS as an underlayment may help in this regard, as it actually detects bitrot and bitflips, which may be the underlying cause of such corruption.
- In the case where a device breaks… if it’s a hard drive that broke, I just pop in a new one and add it to the degraded mirror set. If it’s “something else” that broke, my plan is to pop one of the mirror shards into a spare PoS computer (as truenas scale runs on common x86 hardware) and deal with the ugly-factor until I repair or replace the bigger issue.
- The only way to defend against a cloud provider is replication, so plan accordingly if that is a concern.
- If by “sync’d confidentially” you mean encrypted in transit, I’m pretty sure that TrueNAS has built in replication over SSH. If you meant TNO, then you probably want to build your setup over a cryfs filesystem so no cleartext bits hit the cloud, although on second thought… it’s not really meant for multi-master synchronization… my case just happens to fit it (only one device writes)… so there is probably a better choice for this.
- Setup is a hassle? Yes… just be sure that you invest that hassle into something permanent, if not something like a TrueNAS configuration (where the config gets carried along for the ride with the data) then maybe something like ansible scripts (which is machine-readable documentation). Depending on your organization skills, even hand-written notes or making your own “meta” software packages (with only dependencies & install scripts) might work. What you don’t want to do is manually tweak a linux install, and then forget what is “special” about that server or what is relying on it.
- How safe is my setup? Depends… I still need to start rotating a mirror shard as an offsite backup, so not very robust against a site disaster; Security-wise… I’ve got a lot of private bits, and it works for my needs… as far as I know :)
- Still enthusiastic? I try to see everything as both temporary and a work-in-progress. This can be good in ways because nothing has to be perfect, but can be bad in ways that my setup at any given time is an ugly amalgamation of different experimental ideas that may or may not survive the next “iteration”. For example, I still have centos 7 & python 2 stuff that needs to be migrated or obsoleted.
As an alternative, Unraid. While it’s paid, it strips away a lot of the hassle you mentioned in your post. Has a built in shop where you just click, set up ports/shares and docker containers just spin up for you.
While I’m not a huge fan of their recent subscription model change, I do love their OS (I got I’m still grandfathered into the pre-existing perpetual license.
Absurdly safe.
Proxmox cluster, HA active. Ceph for live data. Truenas for long term/slow data.
About 600 pounds of batteries at the bottom of the rack to weather short power outages (up to 5 hours). 2 dedicated breakers on different phases of power.
Dual/stacked switches with lacp’d connections that must be on both switches (one switch dies? Who cares). Dual firewalls with Carp ACTIVE/ACTIVE connection…
Basically everything is as redundant as it can be aside from one power source into the house… and one internet connection into the house. My “single point of failures” are all outside of my hands… and are all mitigated/risk assessed down.
I do not use cloud anything… to put even 1/10th of my shit onto the cloud it’s thousands a month.
It’s quite robust, but it looks like everything will be destroyed when your server room burns down :)
Fire extinguisher is in the garage… literal feet from the server. But that specific problem is actually being addressed soon. My dad is setting up his cluster and I fronted him about 1/2 the capacity I have. I intend to sync longterm/slow storage to his box (the truenas box is the proxmox backup server target, so also collects the backups and puts a copy offsite).
Slow process… Working on it :) Still have to maintain my normal job after all.
Edit: another possible mitigation I’ve seriously thought about for “fire” are things like these…
https://hsewatch.com/automatic-fire-extinguisher/
Or those types of modules that some 3d printer people use to automatically handle fires…
Yeah I really like the “parent backup” strategy from @hperrin@lemmy.world :) This way it costs much less.
The real fun is going to be when he’s finally up and running… I have ~250TB of data on the Truenas box. Initial sync is going to take a hot week… or 2…
Edit: 23 days at his max download speed :(
Fine… a hot month and a half.
I’m doing something similar (with a lot less data), and I’m intending on syncing locally the first time to avoid this exact scenario.
You should edit you post to make this sound simple.
“just a casual self hoster with no single point of failure”
Nah, that’d be mean. It isn’t “simple” by any stretch. It’s an aggregation of a lot of hours put into it. What’s fun is that when it gets that big you start putting tools together to do a lot of the work/diagnosing for you. A good chunk of those tools have made it into production for my companies too.
LibreNMS to tell me what died when… Wazuh to monitor most of the security aspects of it all. I have a gitea instance with my own repos for scripts when it comes maintenance time. Centralized stuff and a cron stub on the containers/vms can mean you update all your stuff in one go
What does your internal network look like for ceph?
40 ssds as my osds… 5 hosts… all nodes are all functions (monitor/manager/metadataservers), if I added more servers I would not add any more of those… (which I do have 3 more servers for “parts”/spares… but could turn them on too if I really wanted to.
2x 40gbps networking for each server.
Since upstream internet is only 8gbps I let some vms use that bandwidth too… but that doesn’t eat into enough to starve Ceph at all. There’s 2x1gbps for all the normal internet facing services (which also acts as an innate rate limiter for those services).
Absurdly safe.
[…] Ceph
For me these two things are exclusive of each other. I had nothing but trouble with Ceph.
Ceph has been FANTASTIC for me. I’ve done the dumbest shit to try and break it and have had great success recovering every time.
The key in my experience is OODLES of bandwidth. It LOVES fat pipes. In my case 2x 40Gbps link on all 5 servers.
First of all ignore the trends. Fuck docker, fuck nixos, fuck terraform or whatever tech stack gets shilled constantly.
Find a tech stack that is easy FOR YOU and settle on that. I haven’t changed technologies for 4 years now and feel like everything can fit in my head.
Second of all, look at the other people using commercial services and see how stressed they are. Google banned my account, youtube has ads all the time, the app for service X changed and it’s unusable and so on.
Nothing comes for free in terms of time and mental baggage
Yes, you should use something that makes sense to you but ignoring docker is likely going to cause more aggravation than not in the long term.
Yep, I went in this direction…until I gave in during a bare metal install of something…
Docker is not hassle free but usually most setup guides for apps are much much easier with docker
Docker/Podman or any containerized solution is basically the easiest way to get really nice maintenance properties like: updating one app won’t break others, won’t take down the whole system, can be moved from machine to machine.
Containers are a learning curve but I think very worth it for home setups. Compared to something like Kubernetes which I would say is less worth it unless you already know or want to learn Kubernetes.
Docker takes a lot of the management work out of the equation as many of the containers automatically update. Manual updates are as simple as recreating a container with a new image instead of your local one. I would like to add try running Portainer (a graphical management interface for Docker). Breaking out the various options into a GUI helped me learn the ins and outs of Docker better, plus if you end up expanding to multiple docker hosts you can manage them all from one console. I have a desktop, a laptop, and a RPi 4b all running various dockers and having a single pane for management is such a convenience.
Not to mention the advantage of infrastructure as code. All my docker configs are just a dozen or so text files (compose). I can recreate my server apps from a bare VM in just a few minutes then copy the data over to restore a backup, revert to a previous version or migrate to another server. Massive advantages compared to bare metal.
Docker is not a shill tech stack. It is a core developer tool that is certainly not required, but is certainly not fluff
Immutable Nixos. My entire server deployment from partitioning to config is stored in git on all my machines.
Every time I boot all runtime changes are “wiped”, which is really just BTRFS subvolume swapping.
Persistence is possible, but I’m forced to deal with it otherwise it will get wiped on boot.
I use LVM for mirrored volumes for local redundancy.
My persisted volumes are backed up automatically to B2 Backblaze using rclone. I don’t backup everything. Stuff I can download again are skipped for example. I don’t have anything currently that requires putting a process in “maint mode” like a database getting corrupt if I backup while its being written to. When I did, I’d either script gracefully shutting down the process or use any export functionality if the process supported it.
Automate as much as possible. I rsync to both an online and home NAS for all of my hosted stuff, both at home and in the cloud. Updates for the OS and low level libraries are automated. The other updates are generally manual, that allows me to set aside time for fixing problems that updates might cause while still getting most of the critical security updates. And my update schedules are generally during the day, so that if something doesn’t restart properly, I can fix it.
Also, whenever possible I assume a fair amount of time for updates, far beyond what it should actually take. That way I won’t be rushed to fix the problem and end up having to revert to a backup and find time later to redo it. Then most of the time I have extra time for analyzing stats to see if I can improve performance or save money with optimizations.
I’ve never had a remote provider just suddenly vanish though I use fairly well known hosts. And as for local hardware, I just have to do without until I can buy a replacement. Or if it’s going to be some time, I do have old hardware that I could set up as a makeshift, temporary replacement like old desktop computers and some hardware that I use for experimenting like my Le Potato that isn’t powerful enough for much, but ok for the short term.
And finally I’ve been moving to more container-based setups that are easier to get up and running again. I’ve been experimenting with Nomad, Docker Swarm, K3s, etc., along with Traefik and some other reverse proxies so o can keep the workers air-gapped for security.
My advice would be, be pragmatical, an error on a backup script I did not notice wiped the time tracking data I had been collecting on my self hosted database for over a year. I got really anxious at first, because of my mistake and because of the data lost. But at the end of the day… Who cares, life goes on, this is only a hobby.
Not safe at all. I look for robustness. I prefer thinking about things that do not break easily (like ZFS and RAIDZ) instead of “what could possibly go wrong”
And I have never quite figured out how to do restores, so I neglect backups as well.