Job Description
Infrastructure Engineer | Remote
We're looking for an infrastructure engineer who can own and scale cloud-native systems in production. You'll manage Kubernetes clusters, optimize distributed databases, and build self-service tools that let dev teams move fast without breaking things.
What you'll actually do:
- Manage cloud infrastructure (AWS/GCP/Azure) using IaC — Terraform, Helm, FluxCD/ArgoCD
- Handle distributed database backup/restore operations at scale
- Right-size resources based on real usage data and optimize cloud spend
- Build observability frameworks (logs, metrics, traces) and respond to incidents
- Define permission boundaries so teams can self-manage without creating security holes
- Simplify Kubernetes complexity — consolidate configs, eliminate drift
You're a good fit if you:
- Have strong Linux fundamentals and can debug distributed systems
- Code fluently in Python or Go for automation and tooling
- Have real production experience with Kubernetes at scale (not just having deployed a cluster once)
- Understand cloud networking, service meshes, and tunneling tech
- Have done serious database backup/restore work in distributed environments
- Ship infrastructure improvements rather than just maintaining what exists
- Have a GitHub, blog, or portfolio that shows you actually build things
What matters:
- 4+ years running production infrastructure
- Experience with observability tools (Datadog, Prometheus, Grafana)
- Experience with CI/CD systems (GitLab/GitHub)
- Strong communication — you'll work across teams
Remote, US-based. Must be a US citizen.