Calum MacRae
- hi@cmacr.ae
- cmacr.ae
- @cmacrae
- London
I'm a Platform Engineer with over seven years of professional experience. I have a strong passion for Open Source software development and Site Reliability Engineering, with an aptitude for creative, collaborative problem solving. My journey in software & platform engineering started as a hobby, and remains one of the most fulfilling, captivating parts of my life, both inside and outside of work.
Work Experience
Senior Platform Engineer
Since joining MOO as a Senior Platform Engineer, I have helped lead the team in developing high-standard infrastructure and tooling to deliver exceptional service to both our internal & external customers. I have flourished in the role: mentoring other engineers, leading big-picture projects, and establishing a friendly, approachable reputation for the team across the company.
Self-hosted Kubernetes migration to EKS
Initially at MOO, we were hosting our own Kubernetes clusters, built with kops. They were out of date, lagging significantly behind the Kubernetes release cycle, and had gained some stigma around the dangers of upgrades due to failed attempts in the past. I banded together with other senior engineers and formulated a plan for migrating to Amazon's EKS, with a major overhaul of our software delivery pattern: adopting GitOps. As part of the migration, we took it upon ourselves to introduce best-practice solutions, such as ArgoCD & the “app-of-apps” model, the Prometheus Operator, OIDC token leasing for AWS IAM role assumption/resource access, and descheduler to automatically manage workload distribution.
Improvement of legacy service auto-provisioning
Legacy components of the MOO infrastructure are provisioned using a large and complex Ansible codebase. As one of my first projects, I restructured the automatic provisioning mechanism to use concurrent, local provisioning. This was low-touch, eliminated bottlenecks, and had a tech-wide impact: shorter deployment times and quicker rollbacks.
Cross-crew federation for service migration
Part of our team's drive in tech was to influence other engineers to develop their services with high quality, self-service platform operation in mind. To aid in this goal, I joined one of our software teams for a week to help migrate one of their services from a legacy deployment model to Kubernetes. I facilitated this federation with pairing practices, ensuring the knowledge and workflow were shared with the team, so they could adopt this approach for future projects.
GitLab Merge Request dashboard
Every other Friday, our team enjoys “Free Time Fridays”, in which we get to work on a project of our liking, so long as it is loosely related to work. One such project I created was a web frontend, written in Go, to consolidate a team's collective open Merge Requests in GitLab. It tackles the problem of tracking which Merge Requests require review, including those spanning other teams' projects, with a rich front-end displaying essential information about each request. I have since made it Open Source, with accompanying best-practice deployment mechanisms for Kubernetes. It is now widely adopted across MOO tech teams.
DevOps Engineer
I joined Blink as their sole DevOps engineer, tasked with transforming the infrastructure their engineers had built into a robust, declarative, highly available set of systems, and developing tooling to enable the engineers to deliver their work in a fast, safe, and scalable manner.
Moving Blink to declarative configuration & CI/CD
When I joined Blink, all of the infrastructure in place had been created using the AWS web dashboard. I captured the entirety of the platform in Terraform, including the development of many in-house modules for easy deployment of software and infrastructure components. This brought the infrastructure under version control and enabled developers to get their own services deployed with next to no effort, taking their code from their laptop to a fully declarative service in the cloud, across varying environments, with just a few lines of configuration that live directly in their project's repository.
Automated UI integration test system
Working closely with our QA engineer, I designed and implemented a UI test automation workflow to integrate into our development pipelines. I used Go to build a custom launcher that integrated with a 3rd party service to inject data from the build environment into the remote UI testing environment, and finally reported the results back into our messaging platform.
Messaging analytics framework
I collaborated with developers to design and deploy an in-house analytics stack based on Open Source software like Kafka, Cassandra, and Elasticsearch to index & expose rich metadata for the Blink platform. It evolved over time to become an intricate, dynamic system utilising Consul for service discovery, configuration, and self-repair.
Dynamic secrets management
Initially, Docker services hosted on ECS at Blink were using environment variables for their configuration and secrets. This was clearly a security concern. I designed an integrated secrets management framework leveraging Vault, Consul, IAM & STS. This allowed for seamless secret/config consumption - any services using the framework automatically authenticated with Vault & Consul.
Systems & Networks Administrator
Maintenance & development of self-hosted compute, desktop, and network systems.
- Design and implementation of system and network deployments for both private infrastructure and hundreds of client systems
- Technical support for global systems and networks of varying environments (production, staging, demo, development, local office)
- Identifying bottlenecks in workflow and systems efficiency and providing solutions
- General support of all staff systems within the office
- Resident Ansible/Puppet/Git adept
Technical Support Agent
Frontline Technical Support Agent for British Midland International/Vodafone
- Providing technical support for the telephony and IT systems of various locations around the world for over 3000 members of staff
- Liaising with various departments of the company and 3rd parties to assist with managing projects for the IT systems
- Triaging of particularly technical faults highlighted by other agents
Projects
Pantheon: a fully declarative software platform built on Kubernetes & NixOS
For years I've developed and maintained some form of home infrastructure. I built the latest incarnation from scratch, writing a large collection of Nix modules for use with NixOS. It achieves a fully declarative, functional approach to systems deployment, yielding a high quality Kubernetes cluster. I have largely mimicked the offerings of cloud provider solutions on the metal, including (but not limited to) automatic node registration using a service written in Go, data persistence with GlusterFS, dynamic DNS with external-dns & PowerDNS, LAN routable access to k8s services with LoadBalancer objects using MetalLB, Let's Encrypt certificate leasing with cert-manager, and ArgoCD with the "app-of-apps" model for deployment.
Longbow: a Discogs notifier providing near instantaneous alerts for rare vinyl records going on sale
Another passion of mine is collecting vinyl records, particularly rarities. I found the Discogs notification mechanism to be lacklustre, often notifying followers of the sale of a record, only for them to find it had already been purchased. I set out to write my own high-speed, low-latency notifier, written in Go and deployed to Kubernetes. It has proven to be an excellent source of learning and enjoyment. In my endeavours to circumvent Discogs' throttling restrictions, two libraries have been born, which I restructured the tool to rely upon. One is 'haunter', found on my GitHub, which provides easily composable HTTP clients with configurable retry logic and request distribution via a fleet of proxies, exposing Prometheus metrics for request observability (I'm currently implementing tracing with OpenTelemetry). The other is 'dcogs', a distributed Discogs API wrapper built to interrogate the vinyl marketplace in high volumes. The result is direct push notifications to my devices within ~7s of a record appearing for sale, with artwork, release information, and a link to the item for sale.
d2-prometheus-exporter: Destiny 2 player metadata exported to Prometheus
I enjoy playing Destiny 2, a Sci-Fi/Fantasy first-person shooter from Bungie, in my spare time. One great thing Bungie have done from the get-go is expose rich, intricate metadata about player activity via a public API. With so much information available, I thought it'd be fun to translate that data for consumption in Prometheus. This allows for an easy, fluid breakdown of activity behaviour, on which you can build mathematical queries to form your own derived statistics, or which you can represent in a dashboard with Grafana. Think leaderboards, trend & performance tracking. You could even set up alerts around collected metrics.
spacebar: a minimal statusbar for macOS
A dead simple status bar for macOS, written in Objective-C, to accompany tiling window managers such as yabai.
Nix/NixOS contributor/package maintainer
I maintain a few Open Source packages for the Nix packages project (nixpkgs) and also contribute to other Nix-related projects (NixOS & nix-darwin).
Other miscellaneous Open Source projects/contributions
I maintain and contribute to a wide array of Open Source projects. Most information on my Open Source work can be found on my GitHub.