OpenShift day-two operations for application teams
Upgrades, monitoring, logging, quotas, backup mindset, and when to escalate to the platform team — the work that starts after the first deploy succeeds.
I work with Kubernetes, OpenShift and automation — and I still learn something new most weeks.
Freiburg area, South Baden · Kubernetes & platform work · former airline pilot
Start with the beginner series — mental models, the supermarket analogy, then the full path on the blog.
Full beginner series (18 posts) · OpenShift series (8 posts) · kubectl & tools series (7 posts) · Contact
I'm Marc Wilnauer — a DevOps engineer with a few years of hands-on work in Kubernetes, OpenShift, CI/CD, and monitoring, plus smaller React and Node.js projects when they come up. I can hold my own in that stack; I still learn something new most weeks.
01 Stabilise: Stop the bleeding before you chase root cause.
02 Gather facts: Metrics, logs, recent changes; say what you see.
03 Change one thing: Reversible steps, communicated clearly.
Flying taught me to respect procedures when you're tired, to say clearly what you see, and to fix one problem at a time instead of guessing. Kubernetes incidents feel different from an approach briefing, but the habit is similar: stabilise, gather facts, communicate, then change something.
Before tech I flew for the Lufthansa Group. That part of my life is over, but it still influences how I approach incidents and checklists. I'm not selling a big consulting package here — just sharing what I do and writing down things I find useful.
I don't treat aviation as a marketing story — it's just part of why I prefer reversible deploys, honest post-mortems, and metrics I can trust before I scale traffic. I'm still figuring out plenty in tech; the cockpit background is one lens, not a shortcut to being right.
Day-to-day cluster work, deployments, GitOps — and the day-2 surprises I am still learning from.
Pipelines, Helm, Argo CD, sealed-secrets — trying to keep changes boring and reversible.
Prometheus, Grafana, and load tests (Gatling) when we need to know how something behaves under pressure.
React/Node.js apps, internal dashboards, and tooling — often with Cursor or Windsurf in the loop.
Kubernetes with Red Hat extras — Routes, SCCs, oc, ImageStreams, GitOps, and day-2 ops. A separate series on the blog.
Tools change; the point is to understand what you're running and how to roll back when it misbehaves.
Argo CD on OCP, the OpenShift GitOps operator, app-of-apps cautions, sync versus platform guardrails, and drift on managed clusters — without pretending Git is the whole story.
What ImageStreams are for, how BuildConfigs produce tags, S2I vs Dockerfile builds, and when to skip in-cluster builds and pull from an external registry instead.
If something here resonates — a cluster problem, a side project, or just swapping notes — I'd be glad to hear from you.