Job Description
About Us:
duPont REGISTRY Group is a dynamic and innovative company dedicated to delivering exceptional solutions to our clients. We are seeking a highly skilled and versatile DevOps Engineer to join our growing team. This role is hands-on and deeply technical, ideal for someone experienced in infrastructure automation, observability, and microservices orchestration in AWS cloud environments.
Responsibilities
- Design, provision, and manage cloud infrastructure using Pulumi and Infrastructure-as-Code best practices.
- Deploy and operate EKS clusters, maintaining strong security, observability, and cost efficiency in line with best practices.
- Manage Kubernetes workloads using Helm, Kustomize, or custom Kubernetes operators.
- Implement and maintain observability stacks (VictoriaMetrics, VictoriaLogs, Grafana, Alertmanager, Zipkin, and OpenTelemetry) for Node.js, Python, and Golang-based services.
- Build custom dashboards and alert rules for infrastructure and application performance monitoring.
- Automate operations and write infrastructure-oriented code in TypeScript or Golang.
- Administer GitLab CI/CD pipelines, runners, and environments for seamless multi-repo workflows.
- Support backend teams by integrating Node.js, Golang and Python microservices into unified deployment and monitoring pipelines.
- Apply strong security and compliance practices (RBAC, IAM, secrets management, mTLS).
- Contribute to disaster recovery, cost optimization, and reliability engineering strategies.
Required Skills and Qualifications
General skills
- Understanding of system design, architecture, and patterns
- Proven experience in a DevOps/SRE position
- Deep understanding of Linux basics and containers (cgroups, CPU scheduling, kernel parameters)
- OCI containers (Docker/Podman)
- Image build tools (BuildKit, Kaniko, Buildah)
- Networking knowledge at CCNA level
- Backup and restore strategies, disaster recovery planning
Programming languages and frameworks
- JavaScript/TypeScript at a basic level
- Understanding of the Node.js runtime
- Golang basics
- REST API implementations
- SOLID principles
- 12-factor app methodology
- Bash scripting skills and a sense of when Bash should be replaced with a programming language
Git, IaC, CI/CD
- Advanced Git knowledge
- GitLab CI/CD experience
- Pulumi basics, especially the TypeScript implementation
- Understanding of the cattle vs. pets infrastructure approaches
Cloud technologies
- Experience with basic AWS services (EC2, S3, RDS, VPC, IAM, ALB)
- Administration of EKS clusters, worker nodes, and networking; understanding of transparent IAM security (IRSA, Pod execution roles)
- CloudFront CDN, WAF security
Kubernetes
- Understanding of core concepts such as pods, ReplicaSets, deployments, services, ingresses, and secrets
- Experience with different deployment tools and concepts such as Helm, Kustomize (overlays concept), ArgoCD/Flux, and the Pulumi Kubernetes provider
- Stateful/stateless concepts, the difference between Deployments and StatefulSets, storage classes
- Internal Kubernetes networking, CNI plugins such as Cilium, AWS CNI, and Flannel
- Cluster monitoring via Prometheus/VictoriaMetrics, logging via Loki/VictoriaLogs
- Logging agents (Vector, Fluent Bit)
Security
- Source code and container image security scanning (Trivy, SonarQube, Artifactory)
- OAuth and SAML protocol implementations, SSO concepts
- SSL/TLS understanding
- Goteleport access security
- Keycloak IDM
Monitoring and Observability
- Understanding of the difference between metrics, logs, and traces
- Prometheus-like metrics solutions, Grafana dashboards
- Logging platforms based on Loki/VictoriaLogs/Graylog
- Alerting with Alertmanager
- Distributed tracing systems such as Zipkin and Elastic APM
Bonus Points for Experience with
- Kubernetes service mesh (Istio, Linkerd)
- Kubernetes operator pattern (Kubebuilder, Operator SDK)
- OpenSearch and Redis cluster administration.
- RabbitMQ administration
- NoSQL solutions (MongoDB, DynamoDB)
- Policy-as-Code (OPA, Kyverno) or security automation (Falco, Trivy, Snyk).
- Serverless architecture (AWS Lambda)
- Cost management, auto-scaling, and infrastructure benchmarking.
- Chaos engineering or resilience testing practices.