Member of Engineering – Reinforcement Learning Infrastructure

Posted 2hrs ago

Job Description

You will work on the reinforcement learning team to improve the reasoning and coding abilities of large language models. This is a hands-on role spanning applied research and infrastructure, with access to thousands of GPUs.

Responsibilities:

  • Keep up with the latest research, and be familiar with the state of the art in LLMs, RL, and code generation
  • Develop methods for tuning training and inference end-to-end for high throughput
  • Design data control systems in an RL pipeline that govern what the model sees and when
  • Debug cases where infrastructure decisions are silently degrading learning dynamics
  • Build observability tooling that surfaces when a system-level issue is the root cause of a training regression
  • Help build robust, flexible and scalable RL pipelines
  • Optimize performance across the stack — networking, memory, compute scheduling, and I/O
  • Write high-quality, pragmatic code
  • Collaborate closely with the team: plan next steps, discuss ideas, and stay in regular contact

Requirements:

  • Experience with LLMs and model post-training workflows
  • Understanding of how reinforcement learning works and where its main bottlenecks are
  • Solid software engineering fundamentals (testing, code review, debugging complex systems)
  • Proficiency in Python with knowledge of concurrency, asynchronous programming, multiprocessing and performance optimization
  • Familiarity with deep learning frameworks (PyTorch or JAX) and RL workflows (rollouts, replay buffers, policy updates)
  • Experience designing and maintaining distributed RL training systems
  • Experience with large-scale LLM training infrastructure
  • Experience with profiling tools across the stack (e.g. py-spy)
  • Experience with inference stacks (e.g. vLLM)
  • Nice to have: Open-source contributions to RL or distributed ML projects

Benefits:

  • Fully remote work & flexible hours
  • 37 days/year of vacation & holidays
  • Health insurance allowance for you and dependents
  • Company-provided equipment
  • Wellbeing, always-be-learning and home office allowances
  • Frequent team get-togethers
  • A diverse, inclusive, people-first culture