LLM Evaluation Engineer
EU engineers, ready to place with your US clients
Pre-screened on AI. Remote B2B contracts.
About This Role
Requirements
- Develop and maintain LLM evaluation metrics and benchmark datasets
- Design A/B testing frameworks for model performance comparison
- Write clean, production-grade Python code for evaluation pipelines
- Analyze model outputs for bias, hallucination, and accuracy across recruitment use cases
- Collaborate with ML engineers to translate evaluation findings into model improvements
- Document evaluation methodologies and create clear performance reports
Similar Positions
Candidates may also fit these roles
Chatbot Developer
2 matched · Remote
We're hiring a Chatbot Developer to build conversational AI systems that power our AI-first recruiting platform. You'll design, develop, and optimize intelligen…
AI Consultant
2 matched · Remote
Join our AI-first recruiting platform as an AI Consultant and drive intelligent automation across our B2B SaaS product. You'll architect and implement machine l…
AI Application Developer
1 matched · Remote
We're seeking an AI Application Developer to build intelligent, production-ready AI features for our recruiting platform. You'll work on integrating large langu…
AI Workflow Automation Engineer
1 matched · Remote
We're seeking an AI Workflow Automation Engineer to design and implement intelligent automation solutions for our AI-first recruiting platform. You'll build sca…
