Senior Site Reliability Engineer
Job Description
Overview
CTG is seeking to fill a Senior Site Reliability Engineer opening for our client in Jersey City, NJ.
Location: Jersey City, NJ (100% Onsite)
Duration: 6 months
Duties:
- Administer and optimize Kubernetes clusters (Amazon EKS and Red Hat OpenShift), including upgrades, scaling, and security controls.
- Manage middleware platforms such as Apache Kafka, Redis Enterprise clusters, and the Red Hat 3scale API gateway.
- Automate manual operations using Infrastructure-as-Code (IaC) and configuration management tools (Helm, ArgoCD, Terraform, Ansible, etc.).
- Design and implement monitoring dashboards and alerts with Prometheus, Grafana, ELK, and Splunk.
- Instrument distributed applications (Java, Node.js, Python) with tracing, metrics, and logging to meet SLOs.
- Define SLIs/SLOs, manage error budgets, and lead incident response and root cause analysis.
- Forecast capacity, monitor utilization, and tune performance of applications and clusters.
- Implement container security and policy governance with tools such as OPA/Gatekeeper, Kyverno, Trivy, Clair, and Snyk.
- Configure Kubernetes network segmentation (NetworkPolicy, Calico) to secure traffic and enforce reliability.
Skills:
- Strong hands-on expertise with Kubernetes (EKS and/or OpenShift), Helm charts, and Operators.
- Middleware expertise with Kafka and Redis.
- IaC and automation proficiency with Terraform, Ansible, ArgoCD, Helm, or similar tools.
- Advanced observability experience with Prometheus, Grafana, ELK/Splunk.
- Programming and scripting skills in Python, Shell, or Groovy.
- Proficiency in instrumenting distributed applications for observability.
- Ability to enforce and maintain high reliability standards using SLO-driven frameworks.
- Strong debugging, analytical, communication, and collaboration skills.
Experience:
- 12+ years overall industry experience.
- 6+ years in SRE, DevOps, Platform, or Production Engineering roles.
- Proven track record of managing large-scale production systems with high availability.
- Certification in EKS/OpenShift administration (CKA, AWS Certified Kubernetes Administrator, Red Hat Certified OpenShift Administrator, or equivalent) preferred.
- Nice-to-have: experience with service mesh (Istio, Linkerd), chaos engineering (Chaos Monkey, LitmusChaos), security and compliance in regulated environments, and API gateway platforms (e.g., Red Hat 3scale).
Education:
- Bachelor’s degree in Computer Science, Information Technology, or a related field preferred. Equivalent work experience may be considered.
Excellent verbal and written English communication skills and the ability to interact professionally with a diverse group are required.
CTG does not accept unsolicited resumes from headhunters, recruitment agencies, or fee-based recruitment services for this role.
To Apply:
To be considered, please apply directly to this requisition using the link provided. For additional information, please contact Rebecca Olan at Rebecca.Olan@ctg.com. Kindly forward this to any other interested parties. Thank you!
The expected base salary for this position ranges from $140,000 to $160,000. Salary offers are based on a wide range of factors, including relevant skills, training, experience, education, market factors, and, where applicable, licensure or certifications obtained. In addition to salary, a competitive benefits package is offered.