Associate ML Ops Specialist (Remote)

Remote
Full-time
Part-time

Key Responsibilities

- Manage the end-to-end machine learning model lifecycle, from data ingestion and training to production deployment.

- Implement and refine CI/CD pipelines for machine learning to ensure rapid, reliable model releases.

- Containerize ML applications with Docker and manage them in production using orchestration tools like Kubernetes.

- Monitor model performance, system health, and data integrity with tools such as Prometheus and Grafana to detect and resolve issues before they affect production.

- Collaborate with data scientists to provide the necessary infrastructure and tooling for model development.

- Automate operational tasks and infrastructure provisioning using Python scripts and Infrastructure as Code (IaC) tools such as Terraform.

- Document workflows, system architectures, and operational procedures so the team can understand and maintain the systems.


Core Qualifications

- Bachelor’s degree in Computer Science, Engineering, Data Science, or a related technical field.

- Foundational programming skills in Python for scripting and automation.

- Knowledge of core MLOps concepts, including the ML lifecycle, model deployment, and performance monitoring.

- Academic or project-based exposure to containerization (Docker) and orchestration (Kubernetes).

- Proficiency in Linux/Unix environments and the command-line interface.

- Strong communication skills to articulate technical concepts clearly to stakeholders.

- Experience with at least one major cloud platform (AWS, GCP, or Azure) is a significant plus.