Research Intern – Applied Reinforcement Learning

Posted 1 hour ago

Job Description

As a PhD Research Intern at Centific, you will design and evaluate reinforcement learning systems for agentic AI workflows. The role involves developing RL environments and post-training pipelines for practical enterprise solutions.

Responsibilities:

  • Design and evaluate reinforcement learning (RL) systems for agentic AI workflows
  • Develop RL environments, reward models, and post-training pipelines for LLM-based agents
  • Create end-to-end RL pipelines for agentic systems (simulation → training → evaluation)
  • Align LLM-based agents using RLHF, DPO, PPO, and emerging methods
  • Design reward functions, verifiers, and evaluation frameworks
  • Build simulation environments (digital twins) for enterprise workflows
  • Ensure scalable training and inference for RL-based systems
  • Document experiments, ablations, and findings for research and productionization

Requirements:

  • PhD candidate in CS, ML, or related field with research in reinforcement learning or agentic AI
  • Strong Python and PyTorch skills with GPU-based training experience
  • Solid understanding of RL fundamentals (MDPs, policy gradients, value-based methods)
  • Experience with LLMs and post-training techniques (RLHF, DPO, PPO, etc.)
  • Strong experimentation practices (ablation, reproducibility, clear reporting)
  • Experience with RL environment and training libraries (Gymnasium, RLlib, Stable Baselines) (preferred)
  • Research in offline RL, model-based RL, or hierarchical RL (preferred)
  • Publications at top ML and NLP conferences (NeurIPS, ICML, ICLR, ACL) (preferred)
  • Experience with simulation, synthetic data, or multi-agent systems (preferred)
  • Experience with distributed training and large-scale experimentation (preferred)

Benefits:

  • Competitive stipend
  • Mentorship from researchers and engineers
  • Access to modern GPU infrastructure
  • Opportunities to publish and present research