Job Description:
We are seeking an experienced Cloud Infrastructure Splunk Specialist to join our Datacenter Engineering team. You will play a key role in managing cloud infrastructure, security, automation, billing dashboards, and Splunk analytics. The role requires deep experience with Splunk (Splunk Enterprise and the Search Processing Language, SPL), cloud platforms (AWS, Azure, GCP, OCI), infrastructure-as-code (Terraform, CloudFormation), container orchestration (Kubernetes, Docker, AWS Fargate), scripting (Python, Bash, PowerShell), and monitoring/observability tools (Prometheus, Grafana, CloudWatch, Azure Monitor). Experience with security tooling (Chef InSpec/Automate, Trend Micro Deep Security, CyberArk, Tripwire), CI/CD, and cloud cost optimization (FinOps) is highly desirable.
Responsibilities:
- Design, develop, and maintain Splunk dashboards for monitoring and reporting, including creating advanced SPL queries, saved searches, alerts, and visualizations.
- Perform trend and data analysis on cloud resources, infrastructure, and billing systems to identify optimization and cost-savings opportunities.
- Oversee data collection systems and Cloud CLI scripting for automation (AWS CLI, Azure CLI, gcloud, OCI CLI). Manage Splunk forwarders, indexers, clustering, and ingestion pipelines.
- Manage and maintain Cloud Billing Dashboards across AWS, Azure, Oracle Cloud, and Google Cloud, integrating billing APIs and usage reports.
- Integrate and monitor billing APIs from multiple cloud providers and implement cost-allocation, tagging, and FinOps best practices.
- Collaborate with security teams to ensure infrastructure integrity and compliance (IAM, vulnerability management, SOC2/PCI readiness).
- Perform proactive monitoring, troubleshooting, and resolution of system issues using observability tooling (Prometheus, Grafana, ELK, Splunk).
- Work closely with cross-functional teams to support cloud automation and containerized workloads, contributing to IaC templates (Terraform/CloudFormation), CI/CD pipelines (Jenkins/GitHub Actions/GitLab CI), and deployment automation.
- Provide documentation, runbooks, best practices, and recommendations for infrastructure and monitoring improvements; ensure reproducible and auditable configurations (YAML, JSON).
- Ensure continuous improvements in monitoring, automation, and reporting systems, including alert tuning, capacity planning, and performance optimization.
Requirements:
- 8–10 years of hands-on experience in Wintel administration and enterprise datacenter/cloud operations.
- Strong understanding of host-based firewalls, intrusion protection, data integrity, and vulnerability scanners; hands-on experience with EDR/IDS/IPS tools and vulnerability remediation workflows.
- Expertise in AWS and Azure security concepts, including IAM, KMS, VPC/networking, security groups, and cloud security best practices.
- Hands-on experience with Chef InSpec/Automate, Trend Micro Deep Security, CyberArk, and Tripwire, including integration of these tools into monitoring/alerting systems.
- Working knowledge of containers (Kubernetes, AWS Fargate, Docker) and related ecosystem tools (Helm, Kustomize).
- Strong Splunk skills, with proven ability to create searches, reports, and dashboards, manage indexing and clustering, and optimize ingest pipelines.
- Ability to diagnose and resolve technical issues efficiently under pressure, including root cause analysis and incident response.
- Excellent verbal and written communication skills, with strong cross-team collaboration and documentation capabilities.
- Familiarity with cloud billing systems and cost-optimization practices, including experience with billing APIs and cost reporting tools.
- Self-starter with a proactive approach and thorough documentation skills; experience with Git-based workflows and ticketing systems.
Preferred Qualifications:
- Multi-cloud operational experience across Azure, AWS, OCI, and GCP.
- Familiarity with DevOps practices and automation workflows, including IaC and CI/CD integration.
- Exposure to enterprise-scale datacenter engineering and hybrid environments.