RL Environments Engineer – Low-Level Engineering & Kernel Inference Optimization

Full-Time · Remote · Contractor

Are you a low-level systems engineer who wants to teach AI models how to write truly high-performance code?

Join a team building the next generation of training environments that push large language models beyond their current limits—down to kernels, compilers, and hardware-aware optimization.

What will you be working on?

We are building realistic reinforcement learning environments where large language models encounter real research and engineering problems, iterate on solutions, and learn from high-fidelity feedback loops.

In this role, you’ll design and implement RL environments that teach models low-level systems skills, including:

  • Kernel development and optimization across GPU and CPU architectures
  • Hardware-aware performance tuning and memory-efficient computation
  • Compiler, JIT, and AOT optimization workflows
  • High-throughput inference and distributed systems behavior

These environments are developed in collaboration with leading AI labs and directly influence how frontier models learn to write efficient, production-grade systems code.

Your Role

As an RL Environments Engineer, you will:

  • Design and build realistic, high-signal RL environments focused on low-level engineering tasks
  • Implement evaluation and feedback mechanisms that reflect real performance constraints (latency, memory, throughput, correctness) — see the sketch after this list
  • Translate complex systems problems into learnable, iterative tasks for LLMs
  • Work hands-on with kernel code, compilers, and inference systems to define what “good” looks like
  • Iterate quickly based on model behavior, experiment results, and partner feedback
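
To make the evaluation-and-feedback bullet concrete, here is a minimal, hypothetical Python sketch of the kind of reward signal such an environment might compute for a candidate kernel: verify correctness against a reference implementation, then reward measured speedup. Every name and design choice here (score_candidate, the log-scaled reward, the toy matmul) is illustrative only, not our actual harness.

    # Illustrative sketch only -- not the team's actual evaluation harness.
    # All names (score_candidate, _median_latency, ...) are hypothetical.
    import time
    import numpy as np

    def _median_latency(fn, args, repeats=20):
        """Median wall-clock latency of fn(*args) over several runs."""
        samples = []
        for _ in range(repeats):
            start = time.perf_counter()
            fn(*args)
            samples.append(time.perf_counter() - start)
        samples.sort()
        return samples[len(samples) // 2]

    def score_candidate(candidate, reference, args, atol=1e-5):
        """Return (reward, details). Incorrect output earns zero reward;
        correct output earns a reward that grows with measured speedup."""
        expected = reference(*args)
        actual = candidate(*args)
        if not np.allclose(actual, expected, atol=atol):
            return 0.0, {"correct": False}
        ref_t = _median_latency(reference, args)
        cand_t = _median_latency(candidate, args)
        speedup = ref_t / max(cand_t, 1e-9)
        # Log-scaled so each doubling of speed earns equal credit.
        reward = max(0.0, float(np.log2(speedup)) + 1.0)
        return reward, {"correct": True, "speedup": speedup}

    if __name__ == "__main__":
        a = np.random.rand(256, 256).astype(np.float32)
        b = np.random.rand(256, 256).astype(np.float32)
        reference = lambda x, y: x @ y
        # A deliberately naive "candidate" so the example runs end to end.
        candidate = lambda x, y: np.dot(x, y)
        print(score_candidate(candidate, reference, (a, b)))

The log-scaled reward is one plausible choice among many: it credits each doubling of speed equally, nudging a model toward sustained optimization rather than a single lucky speedup.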

This is a remote contractor role, requiring at least 4 hours of overlap with PST and advanced English proficiency (C1/C2).

About Us

We are building the next generation of training data and environments to power the future of AI.

While today’s models are powerful, they struggle with out-of-distribution tasks—especially in complex engineering and research domains. Our work focuses on creating environments where models learn by doing: encountering real problems, receiving realistic feedback, and improving through iteration.

Our team has deep experience building large-scale data infrastructure, tokenizers, and training datasets for state-of-the-art language models. We collaborate closely with leading AI labs to push models closer to their transformative potential.

What are we looking for?

  • Production mindset: You care about correctness, debuggability, performance, and iteration speed—not just prototypes.
  • Strong engineering fundamentals: You write clean, reliable Python and are comfortable working close to the metal when needed.
  • LLM intuition: You understand how current models behave, where they fail, and how training signals shape outcomes.
  • Responsiveness and ownership: You can meet throughput expectations and respond quickly to feedback in a fast-moving research environment.
  • Clear communication: You can explain complex systems concepts clearly and collaborate effectively with researchers and engineers.

Requirements

Minimum qualifications:

  • Strong Python skills (engineering-quality code, not notebook-only)
  • Experience building, debugging, and maintaining production systems
  • Clear understanding of current LLM capabilities and limitations
  • Ability to work independently and iterate quickly in a feedback-driven setup

You may be a great fit if you have experience with:

  • GPU and CPU memory hierarchies, threading models, and performance tuning
  • Kernel development and optimization (CUDA, HIP/ROCm, or similar)
  • Compiler, JIT, or AOT frameworks (e.g., Triton, XLA, LLVM/MLIR, TVM)
  • Low-level systems programming in modern C++ and/or assembly
  • PyTorch internals, custom operators, or inference system optimization
  • Distributed inference, collectives, or GPU communication libraries
  • Mixed- and low-precision computation (FP16/BF16/FP8/INT8)

What do we offer?

  • Fully remote, contractor engagement
  • Work on cutting-edge problems at the intersection of AI, compilers, and hardware
  • Direct collaboration with researchers and engineers shaping frontier AI systems
  • High autonomy, fast iteration, and meaningful technical ownership
  • Competitive compensation aligned with senior, specialized expertise

What does the interview process look like?

  1. Initial Interview: A focused conversation to discuss your background, low-level expertise, and experience working on performance-critical systems.
  2. Technical Deep Dive: A practical, in-depth discussion or exercise centered on kernel optimization, systems reasoning, or environment design—tailored to your strengths.

Regardless of the outcome, we aim to provide clear, constructive feedback.
