Overview
We are seeking a skilled Machine Learning Ops Engineer with a strong background in Cloud Engineering, particularly within the AWS environment, and a focus on platform engineering.
- Location: Letterkenny – Hybrid 3 days per week in office
- Covering hours 11am – 7.30pm
The Role
The ideal candidate will have a passion for developing robust machine learning platforms and ensuring the seamless deployment and operation of ML models through efficient infrastructure and tooling.
Your responsibilities:
- Lead Platform Engineering Initiatives: Design, build, and maintain scalable, reliable, and efficient ML platforms, ensuring robust performance and operational excellence.
- Collaborate Across Teams: Work closely with infrastructure and DevOps teams to integrate ML models into the broader platform, ensuring seamless operation and scaling.
- Enhance ML Infrastructure: Improve the architecture, scalability, stability, and performance of the ML platform, focusing on AWS cloud engineering solutions and platform services.
- Extend ML Frameworks: Develop and extend existing machine learning libraries and frameworks to enhance model performance and integrate them effectively within the platform.
- Monitor and Govern ML Models: Develop processes for model monitoring and governance, ensuring successful ML model operationalization and compliance with industry standards on the platform.
- Technical Roadmap Ownership: Define objectives for the Machine Learning platform, own the technical roadmap, and be accountable for delivering results that align with platform engineering goals.
- Set Platform Standards: Define and uphold standards for platform engineering and operational excellence, striving to run best-in-class ML platforms and continually improving them to incorporate the latest innovations.
- Architectural Best Practices: Design and implement architectural best practices specifically for platform engineering, ensuring efficient and scalable deployment of ML solutions.
Your Profile
Essential skills/knowledge/experience:
- Strong experience in Cloud Engineering, specifically in AWS services such as EC2, S3, Lambda and SageMaker.
- Proficiency in platform engineering practices and frameworks for machine learning operations (ML Ops).
- Experience with data pipeline tools such as Apache Airflow, AWS Glue, or similar.
- Strong programming skills in Python or other relevant languages.
- Experience with DevOps tools such as Git, Jenkins, GitHub Actions, etc.
- Familiarity with Groovy is an added advantage.
- Experience with containerization and orchestration tools like Docker and Kubernetes.
- Knowledge of Azure OpenAI, LangGraph, or Amazon Bedrock is an advantage.
- Excellent problem-solving and analytical skills.
- Strong communication skills and ability to work collaboratively in a team environment.
Desirable skills/knowledge/experience:
- AWS Certified Solutions Architect or AWS Certified Machine Learning - Specialty certification.
- Experience with CI/CD tools and practices.
- Familiarity with big data technologies such as Apache Spark or Hadoop.
To apply, candidates must be eligible to work full-time and long-term in the location specified, or currently hold a valid, appropriate long-term work visa.
eir evo talent, eir evo and our clients are equal opportunity employers who seek to recruit and appoint the best available person for a job regardless of marital / civil partnership status, sex (including pregnancy), age, religion, belief, race, nationality and ethnic or national origin, colour, sexual orientation or disability. eir evo talent, eir evo and our clients apply all relevant Data Protection laws when processing your Personal Data.
If you choose to apply for this opportunity and share your CV or other personal information with eir evo talent, eir evo and our clients, these details will be held by us in accordance with our privacy policy and used by our recruitment team to contact you regarding this or other relevant opportunities at eir evo talent and eir evo.