Lab Infra Rebuild Part 6

This is the final post in a series about rebuilding my lab infrastructure; see the initial post here.

Today we’ll wrap things up with a look at some SaaS tools I use and a general look through some small utilities and things I use to bring it all together.

I’ve been enjoying my summer for the last 3 months, hence the hiatus in posts.

Email

Long ago I used to run my own Zimbra but I gave up on that a few years ago. I’ve been with Fastmail ever since and am really happy with their service.

Delivering email from my 20+ VMs all over the world is tricky though: getting all the various DKIM and other settings right for a big set of IP addresses and ranges quickly becomes a maintenance nightmare. But there’s a constant trickle of stuff from them - cron jobs, monitoring, backup statuses and more.

After some looking around at options I found SMTP2GO, who have a very generous free tier of 1000 emails / month. This is usually enough for me but I ended up paying them for an annual account anyway. This way I have just one egress point to consider in my various email policy setups and thus far, for delivering system emails, this has been a great time saver.
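To illustrate the single egress point idea, here is a minimal Python sketch of how a cron job or backup script on any VM might hand its status mail to one relay. The relay hostname, port, addresses and function names are assumptions made up for illustration and are not taken from any real configuration; use the values your relay provider documents.

```python
import smtplib
import socket
from email.message import EmailMessage

# Assumed placeholder values; substitute those from your relay provider.
RELAY_HOST = "smtp.relay.example"
RELAY_PORT = 587


def build_status_message(sender, recipient, job, status, body):
    """Build a plain-text status mail whose subject names the sending host."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"[{socket.gethostname()}] {job}: {status}"
    msg.set_content(body)
    return msg


def send_via_relay(msg, username, password):
    """Submit the message to the single relay, upgrading to TLS first."""
    with smtplib.SMTP(RELAY_HOST, RELAY_PORT, timeout=30) as smtp:
        smtp.starttls()
        smtp.login(username, password)
        smtp.send_message(msg)
```

Because every machine submits through the same authenticated relay, only that relay needs to appear in the domain's sending policy, rather than every VM's IP address.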

Read on about DNS, Git, SSO and more.


Lab Infra Rebuild Part 5

This is part of an ongoing series of posts about rebuilding my lab infrastructure; see the initial post here.

Today let’s look at VMs, bare metal machines and Operating Systems.

Virtual Machines

As I mentioned, I’ve been a Linode customer since essentially their day one and have had many hundreds of machines there for personal and client needs.

I soured on them quite significantly after they botched their Kubernetes release: initially they had no Europe-side support past basic triage, which led to multiple multi-hour outages, and so I have been making some changes.

My most recent incarnation had about 5 or 6 Linode machines - Puppet Server, 3 x Choria Repos, 2 x DNS, General Services machine.

Today I have 2 Linode machines left. I was reluctant to move them as they were DNS servers for many domains, but I also changed my way of hosting domains during this rebuild so that is less of a concern now.

They are now just RPM/Deb repos for Choria and I will move those elsewhere soon also. That’ll be the first time I am without Linode machines since basically 2003, such a shame. One is in the US to provide a nearby mirror there, I might keep it and just scale it down to a lower spec. But with the recent changes at Linode it feels a bit like it’s time to consider alternatives.

Previously I had 3 x Digital Ocean droplets for my Kubernetes cluster, as discussed that’s all gone now too.

I used to have quite a selection of Vultr machines. I don’t really recall why I left them; I think I felt Linode was just the Rolls-Royce of this kind of Cloud provider and so consolidated to simplify my business accounting etc.

Baremetal

In the previous iteration I had only 1 hosted physical machine and that was my Backups machine running Bacula on a Hetzner SX64 (64GB RAM Ryzen 5 3600) with 4 x 16 TB SATA Enterprise HDD 7200rpm. I do not need much from this machine: it wakes up, does backups, then sleeps again till tomorrow. So the spinning rust is fine for that, I just need lots of it. I rebuilt on a new one just to get a hardware and OS refresh.

Of course I do still use Virtual Machines just managed by Cockpit as per the 2nd part in this series.

I got a pair of Hetzner AX41-NVMe (64GB RAM Ryzen 5 3600) with 2 x 512GB NVMe SSDs each. I was expecting to add a 3rd but really these 2 plus the Ryzen at my office turn out to be plenty for my needs. They have some upgrades available - more RAM, extra disks, SATA can be added etc. I don’t know if Hetzner supports upgrading running machines but this is a nice little platform. At EUR37 per machine that puts them between a 4GB and 8GB shared CPU Linode. You really can’t complain. I might get a 3rd one just for the sake of it and spread my development machines out more. Something for after the summer.

Performance wise, moving from my Droplets and other VMs to these machines has been amazing. For an investment equalling 1 Linode I can run several VMs with no additional cost to expand to another VM or to shuffle memory allocations - or just to allocate more since 64GB is way more than I need. This really is a no brainer for my needs.

I’ve also been a Hetzner customer for a very long time, it’s not clear how long but it feels like maybe since 2008 or so. They’ve had their ups and downs and dodgy datacenters, dodgy connectivity and dodgy hardware, bad English support, but in the past few years I think they’ve really firmed up quite nicely and my machines there have not given me trouble so I felt it’s safe to lean on them a bit more. A few months in now and I’ve not had one minute of problems.

Read on about Operating Systems and more.


Lab Infra Rebuild Part 4

This is part of an ongoing series of posts about rebuilding my lab infrastructure; see the initial post here.

Today I’ll talk about my physical office and office hardware.

Office Space

When my son started going to school I did not look forward to all the driving, so I figured an office near his school would be good: I’d spend the days there and come home after pick up. I rented a nice place in a town called Mosta. It had ample storage and would have made a really great maker space as it had about 4 car garages worth of underground storage that was well lit and ventilated.

Unfortunately this place was opposite a school and parking was absolute hell. I ended up just not using it for months at a time since I could drive home in 12 minutes or spend 45 minutes finding parking, no thanks. I gave up trying to find a garage to rent around there, it’s just crazy.

When I started looking again, the 2nd place I saw, in Żebbuġ, seemed perfect: a fantastic bright 7m x 7m office with an underground 1.5 car garage at a reasonable price. I took that and just recently extended my lease to 5 years. My sister-in-law is also moving her business to the same street and there’s a 3D printer shop around the corner, bonus. Parking is always easy even on the street and it’s like 5 minutes from my house.

Here I am able to set up 3 workstations: my main desk, one for guests with a little Desktop for my son, and a big workstation for soldering, assembling IoT projects and such. I also have a nice 2 seater sofa with a coffee table. With the garage I have space to put things like a laser cutter and more in future - though I am eyeing a small space across the road as a workshop space also for that kind of thing.

Location wise I could not want more, it’s an easy walk or cycle from my house through a beautiful car-free valley and the town is pleasant enough with food options and a small corner shop nearby.

Read on for more about the hardware.


Lab Infra Rebuild Part 3

This is part of an ongoing series of posts about rebuilding my lab infrastructure; see the initial post here.

Today I’ll talk a bit about Configuration Management having previously mentioned I am ditching Kubernetes.

Server Management

The general state of server management is pretty sad: you have Ansible or Puppet and a long tail of things that just can’t work or are under such terrible corporate control that you just can’t touch them.

I am, as most people are aware, a very long term Puppet user since almost day 1 and have contributed significant features like Hiera and the design of Data in Modules. I’ve not had much need/interest in being involved in that community for ages but I want to like Puppet and I want to keep using it where appropriate.

I think Puppet is more or less finished - there are bug fixes and stuff of course - but in general core Puppet is stable, mature, does what one wants and has the extension points one needs. There’s not really any reason not to use it if it fits your needs/tastes, and should things go pear shaped it’s an easy fork. One can essentially stay on a current version for years at this point and it’s fine. There used to be some issues around packaging but even this is Apache-2 now. If you have the problem it solves it does so very well, but the industry has moved on so there is not much scope for extending it imo.

All the action is in the content - modules on the forge. Vox Pupuli are doing an amazing job, I honestly do not know how they do so much or maintain so many modules, it’s really impressive.

There’s a general state of rot though, modules I used to use are almost all abandoned with a few moved to Vox. I wonder if stats about this are available, but I get the impression that content wise things are also taking a huge dive there and Vox are holding everything afloat, with Puppet hoping to make money from selling commercial modules - I doubt it will sustain a business their size but it’s a good idea.

Read on for more about Puppet today in my infra.


Lab Infra Rebuild Part 2

Previously I blogged about rebuilding my personal infra, focussing on what I had before.

Today we’ll start into what I used to replace the old stuff. It’s difficult to know where to start but I think a bit about VM and Container management is as good as any.

Kubernetes

My previous build used a 3 node Kubernetes Cluster hosted at Digital Ocean. It hosted:

  • Public facing websites like this blog (WordPress in the past), Wiki, A few static sites etc
  • Monitoring: Prometheus, Grafana, Graphite
  • A bridge from The Things Network for my LoRaWAN devices
  • 3 x redundant Choria Brokers and AAA
  • Container Registry backed by Spaces (Digital Ocean object storage)
  • Ingress and Okta integration via Vouch
  • Service discovery and automatic generation of configurations for Prom, Ingress etc

Apart from the core cluster I had about 15 volumes, 3 Spaces, an Ingress load balancer with a static IP and a managed MySQL database.

I never got around to going full GitOps on this setup, it just seemed too much to do for a one man infra to both deploy all that and maintain the discipline. Of course I am not a stranger to the discipline required, being from Puppet world, but something about the whole GitOps setup just seemed like A LOT.

I quite liked all of this. When Kubernetes works it is a pleasant experience. Some highlights:

  • Integration with cloud infra like LBs is an amazing experience
  • Integration with volumes to provide movable storage is really great and hard to repeat
  • I do not mind YAML and the diffable infrastructure is really great, no surprise there. I hold myself largely to blame for the popularity of YAML in infra tools at large thanks to Hiera etc, so I can’t complain.
  • Complete abstraction of node complexities is a double-edged sword but I think in the end I came to appreciate it
  • I do like the container workflow and it was compatible with some pre-k8s thoughts I had on this
  • Easy integration between CI and infrastructure with the kubectl rollout abstraction

Some things I just did not like, I will try to mention some things that are not the usual gripes:

  • Access to managed k8s infra is great, but not knowing how it’s put together for the particular cloud can make debugging things hard. I had some Cilium failures that were a real pain
  • API deprecations are constant, production software relies on Beta APIs and will just randomly break. I expected this, but over the 3 years this happened more than I expected. You really have to be on top of all the versions of all the things
  • The complementary management tooling is quite heavy, like I mentioned around GitOps. Traditional CM had a quick on-ramp and was suitable at small scale, I miss that
  • I had to move from Linode K8s to Digital Ocean K8s. The portability promises of pure Kubernetes are lost if you do not take a lot of care
  • Logging from the k8s infra is insane, ever-changing, unusable unless you really really are into this stuff like very deep and very on-top of every version change
  • Digital Ocean does forced upgrades of the k8s, this is fine. The implication is that all the nodes will be replaced so the Prometheus polling source will change, with a big knock-on effect. The way DO does it though involves 2 full upgrades for every 1 upgrade, doubling the pain
  • It just seems like no-one wants to even match the features Hiera has in terms of customization of data
  • Helm

In the end it all just seemed like a lot for my needs and was ever so slightly fragile. I went on a 3 month sabbatical last year and the entire infra went to hell twice, all on its own, because I neglected some upgrades during this time and when Digital Ocean landed their upgrade it all broke. It’s a big commitment.

See the full entry for detail of what I am doing instead.


Lab Infra Rebuild Part 1

I’ve been posting on socials a bit about rebuilding my lab and some opinions I had on tools, approaches and more. Some people have asked for a way to keep up with my efforts, so I figured it might be time to post here for the first time since 2018!

In this post I’ll focus on what came before, a bit of a recap of my previous setup. In addition to a general software refresh, I have now been in Malta for 8 years and a lot of my office hardware was purchased around the time of moving here, so we’ll also cover replacing NAS servers and more.

My previous big OS rebuild was around CentOS 7 days, so that’s about 3 years ago now, high time to revisit some choices.

My infra falls in the following categories:

  • Development machines: usually 10 or so Virtual Machines that I use mainly in developing Choria.io
  • Office support equipment: NAS, Printers, Desktops, Laptops etc
  • Networking equipment: mainly home networking stuff for my locations
  • Hosting publicly visible items: This blog, DNS, Choria Package repos etc
  • Management infrastructure: Choria, Puppet, etc
  • Monitoring infrastructure: Prometheus and friends
  • Backups: for everything
  • General all-purpose things like source control etc
  • My actual office

Below I’ll do a quick run through all the equipment, machines, devices etc. that I use regularly. I’ve largely replaced it all and will detail that in the following posts. It’s not huge infra or anything, all told about 20 to 30 instances in 5 or 6 locations.
