Work Type: Contractor | Permanent Remote
Compensation: USD 50 – 125/hour
Hours: 10 to 40 hours/week (Partial PST overlap required)
Experience Required: 5 – 10 Years
Contract Duration: 1 Month (Extension based on performance)
Notice Period: Immediate preferred
Note
- This is a contract-based, fully remote role.
- Only citizens or valid work permit holders from the US, Canada, Australia, or Western Europe are eligible.
- No medical benefits or paid leave.
- Contractors must manage their own taxes and compliance.
- Payment is based on actual hours worked.
We're seeking skilled Fullstack Engineers to join cutting-edge AI projects focused on enhancing the performance of Large Language Models (LLMs) in real-world software tasks. Your work will help train, test, and validate LLMs by evaluating AI-generated code, contributing to agent-based applications, and collaborating with a high-performance team of engineers and researchers.
You will play a key role in building datasets, testing model responses, and pushing AI systems closer to real developer productivity.
Key Responsibilities
- Contribute to LLM-focused projects that evaluate AI performance on realistic software engineering tasks
- Build and lead agent use cases such as coding copilots, creative tools, or automation bots
- Review and rank 3–4 AI-generated code solutions per task using a structured framework
- Analyze code diffs for accuracy, efficiency, and readability
- Construct fullstack tools for data pipeline support and internal testing environments
- Identify and report edge cases in model outputs, providing well-structured rationale
- Work closely with researchers and engineers to improve model behavior and code quality
Requirements
- 5+ years of hands-on software engineering experience, including strong fullstack development
- At least 1 year of full-time (FTE only) experience at a top 50 tech company if US-based, or at least 2 years if located outside the US
- Deep knowledge of software design, code review, debugging, and scalable systems
- Proven expertise in building production-grade applications using modern frameworks
- Excellent communication skills for writing clear evaluation rationales
- Proficiency in Git workflows, JavaScript/Python, and cloud platforms (AWS, GCP, etc.)