Data Engineer

Robert Half
Location: Washington, DC, USA
Published: 6/14/2022
Technology
Full Time

Job Description

We are seeking a talented Data Engineer to join a cutting-edge team supporting machine learning (ML) research efforts. This role focuses on building and optimizing data pipelines and infrastructure to enable more efficient ML model development and deployment. As a key collaborator with ML and NLP teams, you’ll work on infrastructure solutions for faster model building, optimization, and production deployment. If you're a hands-on engineer with strong cloud infrastructure experience, particularly with AWS, and a passion for enabling ML research, we want to hear from you!


Key Responsibilities

  • Design, build, and maintain data pipelines and ML infrastructure to support AI/ML research teams.
  • Collaborate with ML and NLP scientists to address content retrieval needs and build search-related solutions.
  • Create scalable solutions using technologies like Python, Airflow, Databricks, Kubernetes, and AWS.
  • Optimize Databricks notebooks using PySpark, structure ML pipelines, and assist with model production workflows.
  • Develop and deploy REST APIs for model serving and infrastructure communication.
  • Maintain and support Kubernetes clusters for model deployment in production environments.
  • Focus on content retrieval and database query solutions, contributing backend querying expertise (Java, Kotlin, Go).

Schedule: 4 days onsite, 1 day remote

Key Technologies

  • Required:
      • Python (microservices and data pipelines).
      • Kubernetes (container orchestration and deployment).
      • Airflow (designing and maintaining data pipelines).
      • AWS (cloud infrastructure expertise).
  • Preferred (not required):
      • Databricks (workbooks and ML pipeline development).
      • PySpark (notebook optimization).
      • PyTorch (for process improvement/distillation).
      • Content retrieval/database querying (Java, Kotlin, Go).


What We’re Looking For

  • Strong background in AWS infrastructure with the ability to hit the ground running.
  • Skills in Kubernetes, especially for deploying models in containerized environments.
  • Proficiency in Airflow for robust and scalable data pipeline development.
  • Demonstrated expertise in Python, especially for microservices and pipelines.
  • Ability to collaborate with ML teams without direct modeling responsibilities (this is ML-adjacent, focused on infrastructure).
  • Nice to Have: Experience with content retrieval, backend querying, and ML pipelines.


Disqualifiers

  • Candidates heavily focused on machine learning modeling roles (this position is centered on infrastructure).


Why Join Us?

  • Work closely with top-tier AI/ML research teams on cutting-edge initiatives.
  • Develop tools and infrastructure that drive innovation in AI and NLP.
  • Competitive compensation with benefits (health, vision, dental).