It’s not exactly the self-hosting itself, but maintaining it and backing it up that is hard for me. So many “what if”s come to mind: what if the DB gets corrupted? What if the device breaks? If it’s on a cloud provider, what if they decide to remove the server?

To self-host things with confidence, I need a local server and a remote one that stay in sync, and setting this up is a hassle I don’t want to take on.

So my question is: how safe is your setup? Are you still enthusiastic about it?

  • Saik0@lemmy.saik0.com
    5 months ago

    Absurdly safe.

    Proxmox cluster, HA active. Ceph for live data. TrueNAS for long-term/slow data.

    About 600 pounds of batteries at the bottom of the rack to weather short power outages (up to 5 hours). 2 dedicated breakers on different phases of power.

    Dual/stacked switches with LACP’d connections that must be on both switches (one switch dies? Who cares). Dual firewalls with CARP ACTIVE/ACTIVE connection…

    Basically everything is as redundant as it can be aside from one power source into the house… and one internet connection into the house. My “single points of failure” are all outside of my hands… and are all mitigated/risk assessed down.

    I do not use cloud anything… to put even 1/10th of my shit onto the cloud it’s thousands a month.

    • iso@lemy.lol (OP)
      5 months ago

      It’s quite robust, but it looks like everything will be destroyed when your server room burns down :)

      • Saik0@lemmy.saik0.com
        5 months ago

        Fire extinguisher is in the garage… literal feet from the server. But that specific problem is actually being addressed soon. My dad is setting up his cluster and I fronted him about 1/2 the capacity I have. I intend to sync long-term/slow storage to his box (the TrueNAS box is the Proxmox Backup Server target, so it also collects the backups and puts a copy offsite).

        Slow process… Working on it :) Still have to maintain my normal job after all.

        Edit: another possible mitigation I’ve seriously thought about for “fire” are things like these…

        https://hsewatch.com/automatic-fire-extinguisher/

        Or those types of modules that some 3d printer people use to automatically handle fires…

        • iso@lemy.lol (OP)
          5 months ago

          Yeah I really like the “parent backup” strategy from @hperrin@lemmy.world :) This way it costs much less.

          • Saik0@lemmy.saik0.com
            5 months ago

            The real fun is going to be when he’s finally up and running… I have ~250TB of data on the TrueNAS box. Initial sync is going to take a hot week… or 2…

            Edit: 23 days at his max download speed :(

            Fine… a hot month and a half.
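
As a rough check on the initial-sync estimate above, here is a back-of-the-envelope sketch. It assumes decimal terabytes (1 TB = 10^12 bytes) and a steady link; the function names and the `efficiency` parameter are illustrative, not anything from the thread. Under those assumptions, 250 TB in 23 days works out to roughly a 1 Gbps download link, and the same link running at about half its rated speed would land near the "month and a half" figure.

```python
SECONDS_PER_DAY = 86_400


def transfer_days(size_tb: float, link_mbps: float, efficiency: float = 1.0) -> float:
    """Days needed to move size_tb (decimal TB) over a link of link_mbps.

    efficiency is an assumed fraction of the rated speed actually achieved
    (protocol overhead, other traffic, and so on).
    """
    bits = size_tb * 1e12 * 8
    seconds = bits / (link_mbps * 1e6 * efficiency)
    return seconds / SECONDS_PER_DAY


def implied_mbps(size_tb: float, days: float) -> float:
    """Sustained link speed (Mbps) implied by moving size_tb in the given days."""
    bits = size_tb * 1e12 * 8
    return bits / (days * SECONDS_PER_DAY) / 1e6


if __name__ == "__main__":
    print(f"250 TB at 1 Gbps: {transfer_days(250, 1000):.1f} days")
    print(f"250 TB in 23 days implies about {implied_mbps(250, 23):.0f} Mbps")
    print(f"250 TB at 1 Gbps, 50% efficiency: {transfer_days(250, 1000, 0.5):.1f} days")
```

This is also why seeding the first copy locally and shipping the disks is attractive: only the incremental changes then have to cross the slow link.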

            • shiftymccool@programming.dev
              5 months ago

              I’m doing something similar (with a lot less data), and I’m intending on syncing locally the first time to avoid this exact scenario.

    • Possibly linux@lemmy.zip
      5 months ago

      You should edit your post to make this sound simple.

      “just a casual self hoster with no single point of failure”

      • Saik0@lemmy.saik0.com
        5 months ago

        Nah, that’d be mean. It isn’t “simple” by any stretch. It’s an aggregation of a lot of hours put into it. What’s fun is that when it gets that big you start putting tools together to do a lot of the work/diagnosing for you. A good chunk of those tools have made it into production for my companies too.

        LibreNMS to tell me what died when… Wazuh to monitor most of the security aspects of it all. I have a Gitea instance with my own repos for scripts when it comes maintenance time. Centralized stuff and a cron stub on the containers/VMs can mean you update all your stuff in one go.
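
The "centralized scripts plus a cron stub" idea above could look something like the following. This is only a sketch under assumptions: the repo path `/opt/maintenance-scripts`, the script name `update.sh`, and both function names are hypothetical, and the thread does not say how the actual stub is written. The stub fast-forwards a local clone of the central repo, then runs whatever maintenance script that repo currently ships, so changing the script in one place changes it everywhere.

```python
import subprocess
from pathlib import Path


def build_commands(repo_dir: str, script: str) -> list[list[str]]:
    """Return the commands the stub would run, in order."""
    repo = Path(repo_dir)
    return [
        # Fast-forward the local clone of the central scripts repo.
        ["git", "-C", str(repo), "pull", "--ff-only"],
        # Run the (hypothetical) maintenance script shipped in that repo.
        ["sh", str(repo / script)],
    ]


def run_stub(repo_dir: str, script: str, dry_run: bool = True) -> list[list[str]]:
    """Print the commands (dry run) or execute them, stopping on the first failure."""
    commands = build_commands(repo_dir, script)
    for argv in commands:
        if dry_run:
            print("would run:", " ".join(argv))
        else:
            subprocess.run(argv, check=True)
    return commands


if __name__ == "__main__":
    # Hypothetical path and script name; dry run by default so nothing is changed.
    run_stub("/opt/maintenance-scripts", "update.sh")
```

Each container or VM would then only need a single cron entry that calls the stub on a schedule.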

    • Mora@pawb.social
      5 months ago

      Absurdly safe.

      […] Ceph

      For me these two things are exclusive of each other. I had nothing but trouble with Ceph.

      • Saik0@lemmy.saik0.com
        5 months ago

        Ceph has been FANTASTIC for me. I’ve done the dumbest shit to try and break it and have had great success recovering every time.

        The key in my experience is OODLES of bandwidth. It LOVES fat pipes. In my case, 2x 40Gbps links on all 5 servers.

      • Saik0@lemmy.saik0.com
        5 months ago

        40 SSDs as my OSDs… 5 hosts… all nodes run all functions (monitor/manager/metadata servers). If I added more servers I would not add any more of those… (I do have 3 more servers for “parts”/spares… but could turn them on too if I really wanted to.)

        2x 40Gbps networking for each server.

        Since upstream internet is only 8Gbps I let some VMs use that bandwidth too… but that doesn’t eat into enough to starve Ceph at all. There’s 2x 1Gbps for all the normal internet-facing services (which also acts as an innate rate limiter for those services).