Position: Generalist Evaluator Expert
Type: Hourly contract
Compensation: $35-$40 per hour
Location: Remote
Commitment: At least 20 hours per week
Role Responsibilities
- Author prompt–golden answer pairs to train and evaluate advanced language models
- Create detailed prompts with multiple constraints and instructions
- Establish expectations for correct responses in general consumer contexts and develop comprehensive rubrics
- Run prompts through models and assess outputs against defined expectations
- Collaborate in QA review processes to ensure prompt tasks and rubrics are rigorous and consistent before integration into official benchmarks
Requirements
- BS or BA from a reputable institution, completed or in progress
- Strong writing and critical thinking skills
- Ability to work independently and meet deadlines
- Familiarity with using ChatGPT or similar tools for personal decision-making or general interests
- Experience in teaching or research preferred
Application Process (takes 20 min)
- Upload resume
- Interview (15 min)
- Submit form