Senior Infrastructure & DevOps Engineer
Job Description
Overview
Build and maintain the automated production line for PHIN's Physical Superintelligence. You will own the plumbing that lets our simulation engine scale seamlessly, so that our team can deploy updates multiple times a day and ingest massive amounts of simulation data without friction.
Core Responsibilities
- Greenfield Observability: Architect and implement a comprehensive logging, monitoring, and alerting stack across our platform from the ground up.
- Compute Architecture, Scaling & FinOps: Provision, manage, and optimize highly concurrent scaling clusters. Act as a cloud-agnostic thinker to direct future architecture and implement rigorous FinOps practices to minimize the cost of running thousands of simultaneous jobs.
- Infrastructure as Code (IaC): Own, maintain, and expand our Terraform footprint.
- Continuous Deployment (CD): Design and maintain high-velocity CI/CD pipelines supporting multiple deployments per day. Ensure "code to production" is a seamless, automated journey.
- Backend Robustness: Manage the API layer that sits between the infrastructure and the application layer. Read and refactor services to optimize data movement, squash bottlenecks, and maintain security.
- Data Pipeline Architecture: Build the underlying pipelines to move, store, and process the massive datasets generated by atomic-scale simulations.
- Platform DevEx & MLOps: Build self-serve tooling and event-driven pipelines that empower the entire organization. Create seamless abstractions so our developers can focus on what they do best.
- DevOps & Intelligence Automation: Ruthlessly automate manual toil. Use and build AI-driven tools to manage logs, infrastructure provisioning, and business workflows.
- Standard Enterprise Security: Implement and maintain security best practices (SOC 2/ISO focus) required for enterprise-grade contracts.
Candidate Profile
- Experience: 5–8 years as a high-output Individual Contributor in Infrastructure or Backend roles.
- Generalist Capability: Comfortable touching any part of the system, from networking and security to API design and data engineering. Familiarity with Python and TypeScript/Node.js.
- Cloud & HPC Familiarity: Deep experience with major cloud providers. Familiarity with high-performance computing (HPC) schedulers like Slurm is a major plus.
- Tool Agnostic: Not married to one framework; you choose the best tool for the job (K8s, Serverless, HPC Schedulers, etc.).
- AI-Native: Expert user of intelligence tools (Claude, Cursor, Codex, Copilot, Agents, etc.) to 10x your own productivity and automate business tasks.
- ML Collaboration: Previous experience working closely with machine learning teams, supporting ML workflows, or building MLOps pipelines is highly desirable.
Application Questionnaire
Please provide concise answers to the following questions as part of your application:
- Deployment at Scale: We aim for multiple deployments per day for a compute-heavy simulation engine. How do you design a CI/CD pipeline that ensures high velocity without sacrificing the stability of long-running HPC jobs?
- API Ownership: As the manager of the API layer, how do you approach versioning and contract management when the underlying physics models are changing rapidly?
- The "Generalist" Test: If you were tasked with building a data pipeline for 100,000 daily simulations on a cloud provider you’ve used less frequently (e.g., Azure vs. AWS), what is your step-by-step process for getting it production-ready in a week?
- Automation & AI: How have you applied intelligence tools to your own job? Give a specific example of an internal tool or agent you’ve built to automate your DevOps or Backend workflows.
- Standard Security: What are the top three "low-hanging fruit" security implementations you would prioritize to prepare a startup’s infrastructure for a $10M enterprise audit?
- Values Alignment: Rate yourself 1–5 on how well you embody each of the following values, and give a one-sentence rationale for each: People-first, Constant Improvement, Efficiency, High Quality, Visionary.