AI Safety Researcher - Red Team Lead

Reflection • United States

Visa Sponsorship

AI Summary

Own the red-teaming and adversarial evaluation pipeline for Reflection's open AI models, identifying security, misuse, and alignment vulnerabilities. Work closely with the Alignment team to develop safety guardrails and validate model releases against risk thresholds. Requires deep technical expertise in LLM safety, adversarial attacks, and automated evaluation pipeline development.

Key Highlights

Own red-teaming and adversarial evaluation pipeline for open AI models

Validate model releases against risk thresholds as a critical gatekeeper

Develop scalable automated safety benchmarks and jailbreaking defenses

Key Responsibilities

Own the red-teaming and adversarial evaluation pipeline for Reflection's models, continuously probing for failure modes across security, misuse, and alignment gaps

Work hand-in-hand with the Alignment team to translate safety findings into concrete guardrails, ensuring models behave reliably under stress

Validate that every release meets the lab's risk thresholds before it ships, serving as a critical gatekeeper for open weight releases

Develop scalable, automated safety benchmarks that evolve alongside model capabilities, moving beyond static datasets to dynamic adversarial testing

Research and implement state-of-the-art jailbreaking techniques and defenses to stay ahead of potential vulnerabilities in the wild

Technical Skills Required

LLM safety, adversarial attacks, software engineering

Benefits & Perks

Top-tier compensation with salary and equity

Stock options for all contributors

Comprehensive health, dental, vision, and life insurance with wellness allowance

Nice to Have

Experience with Reinforcement Learning (RLHF/RLAIF) and how it impacts model safety and alignment

Job Description

Our Mission

Reflection is a research lab making intelligence open and accessible for everyone to use, customize, and build on. We build open models that let anyone control their intelligence and help shape the future of AI. Our mission: make intelligence open and accessible to all.

About The Role

Own the red-teaming and adversarial evaluation pipeline for Reflection’s models, continuously probing for failure modes across security, misuse, and alignment gaps.
Work hand-in-hand with the Alignment team to translate safety findings into concrete guardrails, ensuring models behave reliably under stress and adhere to deployment policies.
Validate that every release meets the lab’s risk thresholds before it ships, serving as a critical gatekeeper for our open weight releases.
Develop scalable, automated safety benchmarks that evolve alongside our model capabilities, moving beyond static datasets to dynamic adversarial testing.
Research and implement state-of-the-art jailbreaking techniques and defenses to stay ahead of potential vulnerabilities in the wild.

About You

Graduate degree (MS or PhD) in Computer Science, Machine Learning, or related discipline, or equivalent practical experience in AI Safety.
Deep technical understanding of LLM safety, including adversarial attacks, red-teaming methodologies, and interpretability.
Strong software engineering capabilities with experience building automated evaluation pipelines or large-scale ML systems.
Experience with Reinforcement Learning (RLHF/RLAIF) and how it impacts model safety and alignment is a strong plus.
Thrive in a fast-paced, high-agency startup environment with a bias toward action.

Willing to make high-stakes decisions regarding model release and safety thresholds.
Passionate about advancing the frontier of intelligence.

What We Offer

We believe that to make intelligence open and accessible to all, you need to start at the foundation. Joining Reflection means building from the ground up as part of a talent-dense team. You will help define our future as a company, and help define the future of open foundational models.

We want you to do the most impactful work of your career with the confidence that you and the people you care about most are supported.

Top-tier compensation: Salary and equity structured to recognize and retain our talent globally.
Stock options: Everyone who joins and contributes to Reflection's success gets to share in the upside through stock options.
Health & wellness: Comprehensive medical, dental, vision, and life insurance, with an annual wellness allowance.
Meals: Lunch and dinner are provided in the office daily.
Life & family: 22 weeks of paid parental leave for all new birthing and non-birthing parents, including adoptive and surrogate journeys.
Vacation days: Unlimited paid time off in the U.S. and 30 days in the U.K.
Sponsorship support: We sponsor visas to help exceptional talent join our team and support long-term immigration pathways where applicable.
Team building: We have regular off-sites, happy hours, and team celebrations.

Job Overview

Posted Date: Jul 04, 2026

Employment Type: Full-time

Experience Level: Not Applicable

Location: United States

Category: Programming

Company: Reflection
