Job Description
- Deploy and secure an on-premises AI infrastructure for hosting large language models (LLMs)
- Install, configure, and maintain AI model-serving frameworks on internal GPU-enabled servers
- Develop and maintain robust, scalable APIs to ensure internal access to AI capabilities and seamless integration with enterprise applications and data systems
- Collaborate on the implementation of a Retrieval-Augmented Generation (RAG) pipeline and AI agents to automate business workflows
Requirements
- Bachelor’s degree or higher in Computer Science, Electrical/Computer Engineering, or a related field
- Minimum 4 years of experience in a systems engineering, DevOps, or MLOps role
- Proficiency in Linux server administration
- Strong working knowledge of GPU-accelerated compute environments
- Proficiency in Python for scripting, automation, and building AI/ML data pipelines
- Experience deploying LLMs or generative AI models in production environments
- Working knowledge of RAG architectures, including vector databases, embedding models, and retrieval strategies
Benefits
- Health Care Plan (Medical, Dental & Vision)
- Life Insurance (Basic, Voluntary & AD&D)
- Paid Time Off (Vacation, Sick & Public Holidays)
- Training & Development
- Retirement Plan (401k, IRA)