Job Description
We are looking for a skilled Software Developer to design, develop, and maintain middleware applications that drive business functionality and enhance user experiences. This long-term contract position requires expertise in creating scalable, high-performing solutions while collaborating with cross-functional teams to ensure seamless integration and operational readiness. Join our team in Philadelphia, Pennsylvania, to contribute to impactful projects and deliver innovative software solutions.
We are seeking a talented Data Engineer to join a cutting-edge team supporting machine learning (ML) research efforts. This role focuses on building and optimizing data pipelines and infrastructure to enable more efficient ML model development and deployment. As a key collaborator with ML and NLP teams, you’ll work on infrastructure solutions for faster model building, optimization, and production deployment. If you're a hands-on engineer with strong cloud infrastructure experience, particularly with AWS, and a passion for enabling ML research, we want to hear from you!
Key Responsibilities
- Design, build, and maintain data pipelines and ML infrastructure to support AI/ML research teams.
- Collaborate with ML and NLP scientists to address content retrieval needs and build search-related solutions.
- Create scalable solutions using technologies like Python, Airflow, Databricks, Kubernetes, and AWS.
- Optimize Databricks notebooks using PySpark, structure ML pipelines, and assist with model production workflows.
- Develop and deploy REST APIs for model serving and infrastructure communication.
- Maintain and support Kubernetes clusters for model deployment in production environments.
- Focus on content retrieval and database query solutions, contributing backend querying expertise (Java, Kotlin, Go).
Schedule: 4 days onsite, 1 day remote
Key Technologies
- Required:
  - Python (microservices and data pipelines).
  - Kubernetes (container orchestration and deployment).
  - Airflow (designing and maintaining data pipelines).
  - AWS (cloud infrastructure expertise).
- Preferred (not required):
  - Databricks (notebooks and ML pipeline development).
  - PySpark (notebook optimization).
  - PyTorch (for process improvement/distillation).
  - Content retrieval/database querying (Java, Kotlin, Go).
What We’re Looking For
- Strong background in AWS infrastructure with the ability to hit the ground running.
- Skills in Kubernetes, especially for deploying models in containerized environments.
- Proficiency in Airflow for robust and scalable data pipeline development.
- Demonstrated expertise in Python, especially for microservices and pipelines.
- Ability to collaborate with ML teams without direct modeling responsibilities (this is ML-adjacent, focused on infrastructure).
- Nice to Have: Experience with content retrieval, backend querying, and ML pipelines.
Disqualifiers
- Candidates heavily focused on machine learning modeling roles (this position is centered on infrastructure).
Why Join Us?
- Work closely with top-tier AI/ML research teams on cutting-edge initiatives.
- Develop tools and infrastructure that drive innovation in AI and NLP.
- Competitive compensation with benefits (health, vision, dental).