Senior RL (Reinforcement Learning) Engineer – San Francisco, CA

Permanent
San Francisco, 94111
AI RL Research Engineering
$400k pa + $500k pa + equity
swx41187

Senior Reinforcement Learning Engineer | San Francisco, CA (FiDi) | On-site 5 days | Full-time | $300,000 to $500,000 + Equity

The hiring company is a small, elite AI research team working on reinforcement learning in open-ended settings. The team includes researchers from leading PhD programmes and tier-one AI organisations. It is early-stage, well-resourced, and moving quickly, which makes this a ground-floor opportunity with significant scope and impact.

The Role

The team is hiring a Senior RL Engineer to sit at the intersection of research and production engineering. You will own the translation of RL research ideas into reliable, measurable training systems and drive technically complex projects end to end.

Key Responsibilities

Build and improve RL training pipelines for language model-based agents
Implement reward functions, verifiers, environment interfaces, rollout pipelines, and evaluation harnesses
Design experiments to test whether RL methods are improving model behaviour, sample efficiency, robustness, or generalisation
Build monitoring tooling: regression tests, eval suites, and reward-hacking checks
Debug unstable training runs and diagnose learning dynamics failures across algorithms, rewards, data, infrastructure, and evals
Manage GPU clusters, distributed training, and compute efficiency
Build 0-to-1 systems for new RL workflows and harden them into reusable infrastructure
Own ambiguous technical problems from problem framing through to delivery

Requirements

Strong applied ML engineering background: shipped systems, open-source work, competitions, or early-stage startup experience
Hands-on experience scaling RL pipelines and debugging training issues
Familiarity with RL environments and large language models; diffusion model experience a plus
Python proficiency and strong working knowledge of PyTorch or JAX
Solid grounding in RL, supervised learning, optimisation, and modern deep learning
Independent, intellectually curious, and able to drive ambiguous problems to working solutions
Comfortable collaborating with researchers while holding high engineering standards
PhD not required — strong applied experience equally valued

Package

Visa sponsorship available (H1B transfer, TN, OPT, O-1); existing US work authorisation preferred
