Reinforcement Learning Jobs

How to use Reinforcement Learning in applications

If a job requires Reinforcement Learning, the skill should be supported by a project, course or portfolio example. The application check reviews whether the skill is actually evidenced in your CV.

Document a small project or notebook
Name relevant tools and methods
Connect the evidence to the target role

Results

Average score: 94.3

10 results on this page. 37 results in total. More results are available via pagination, company pages, skill pages and job detail pages.

Machine Learning Engineer, AI Assistant & Autonomous AI Agents

Glean · Mountain View, CA (HQ)

98/100

Mountain View, CA (HQ) · USA · greenhouse · 2026-06-08

Why this is a real AI job: The role is clearly focused on developing and optimizing LLM-based agents, reinforcement learning, evaluation frameworks, and agile architectures. AI is the predominant core of the work, in both research and production environments.

About Glean: Glean is the Work AI platform that helps everyone work smarter with AI. What began as the industry’s most advanced enterprise search has evolved into a full-scale Work AI ecosystem, powering intelligent Search, an AI Assistant, and scalable AI agents on one secure, open platform. With…

Sr. Staff AI/ML Engineer

Gusto · United States - Denver, CO, United States - Los Angeles, CA - Remote, United States - New York, NY, United States - San Francisco, CA

95/100

United States - Denver, CO, United States - Los Angeles, CA - Remote, United States - New York, NY, United States - San Francisco, CA · USA · greenhouse · 2026-07-31

LLMs · NLP · Retrieval Augmented Generation · Agent Orchestration · Deep Learning · Reinforcement Learning · PyTorch · TensorFlow · Hugging Face · MLOps

Why this is a real AI job: The role is explicitly focused on building and scaling an AI platform, including core components like agent orchestration, RAG infrastructure, model serving, and evaluation tooling. The job description heavily emphasizes AI/ML expertise and production deploym…

About Gusto At Gusto, we're on a mission to grow the small business economy. We handle the hard stuff — payroll, health insurance, 401(k)s, and HR — so owners can focus on their craft and their customers. With teams in Denver, San Francisco, and New York, we support more than 500,000 small business…

Senior Engineering Manager, Machine Learning Platform

Affirm · Remote US

95/100

Remote US · USA · greenhouse · 2026-07-31

Machine Learning · MLOps · Deep Neural Networks · Transformer Architectures · Reinforcement Learning · GPU Computing · Feature Stores · Model Serving · Data Pipelines

Why this is a real AI job: The role is explicitly focused on leading the engineering team building and operating the core ML platform at Affirm. The description details deep involvement with all aspects of the ML lifecycle, from feature computation to model serving, including advanced…

Affirm is reinventing credit to make it more honest and friendly, giving consumers the flexibility to buy now and pay later without any hidden fees or compounding interest. We are seeking a Senior Engineering Manager to lead our ML Training & Serving team. This is a senior technical leadership role…

Senior Data Scientist, Growth Alliance

HelloFresh · Warszawa, Masovian Voivodeship, Poland

95/100

Warszawa, Masovian Voivodeship, Poland · Germany · greenhouse · 2026-07-31

Machine Learning · Reinforcement Learning · Contextual Bandits · Data Science · MLOps · Python · Spark · MLflow · Causal Inference · Experimentation

Why this is a real AI job: The role explicitly focuses on building and deploying reinforcement learning models (contextual bandits) for optimizing communication channels and personalizing user experiences. The job description details tasks like model building, experimentation, producti…

Work with HelloFresh in Warsaw and its HelloTech organisation, HelloFresh’s global technology backbone with more than 1000 people, building the digital products that power our end-to-end food experience. From meal kits and ready-to-eat meals to specialty offerings like pet food and premium meat & s…

Principal Machine Learning Engineer - LLM Fine-tuning and Optimization

Airbnb · United States

95/100

United States · USA · greenhouse · 2026-07-31

Machine Learning · LLM · Fine-tuning · Optimization · PyTorch · Data Science · MLOps · RAG · NLP · Generative AI

Why this is a real AI job: The role explicitly focuses on fine-tuning and optimizing LLMs, building AI products, and driving AI initiatives. The responsibilities and qualifications are heavily centered around AI/ML technologies.

Airbnb was born in 2007 when two hosts welcomed three guests to their San Francisco home, and has since grown to over 5 million hosts who have welcomed over 2 billion guest arrivals in almost every country across the globe. Every day, hosts offer unique stays and experiences that make it possible f…

Senior Staff Machine Learning Engineer, Post Training

Airbnb · United States

95/100

United States · USA · greenhouse · 2026-07-31

Machine Learning · LLM · Data Science · PyTorch · MLOps · NLP · Generative AI · Reinforcement Learning · Model Optimization · Data Processing

Why this is a real AI job: The role is explicitly focused on developing and deploying LLMs, ML models, and AI products. The description details tasks such as fine-tuning LLMs, building AI-powered features, and contributing to the ML platform architecture. The required expertise heavily…

Senior Machine Learning Engineer, Price Modeling

Airbnb · United States

95/100

United States · USA · greenhouse · 2026-07-31

Machine Learning · Reinforcement Learning · Python · Data Analysis · Modeling

Why this is a real AI job: The role explicitly focuses on developing and refining machine learning models (specifically reinforcement learning) for pricing recommendations. This is a core AI function.

Machine Learning Engineer, AV Engineering

Wayve · Herzliya, Israel

95/100

Herzliya, Israel · UK · greenhouse · 2026-07-31

Machine Learning · Deep Learning · Data Mining · Data Curation · Model Training · Model Evaluation · Reinforcement Learning · Transformer Networks · PyTorch · MLOps

Why this is a real AI job: The role explicitly focuses on developing and training AI models for autonomous driving (L2-L4), owning the entire ML lifecycle, and deploying these models into vehicles. The core responsibilities are heavily centered around AI/ML tasks.

About us: Founded in 2017, Wayve is the leading developer of Embodied AI technology. Our advanced AI software and foundation models enable vehicles to perceive, understand, and navigate any complex environment, enhancing the usability and safety of automated driving systems. Our vision is to create…

Applied Scientist / Machine Learning Engineer

Wayve · Sunnyvale, California USA

95/100

Sunnyvale, California USA · UK · greenhouse · 2026-07-31

Machine Learning · Data Science · Deep Learning · Foundation Models · LLMs · Computer Vision · NLP · Reinforcement Learning · Data Curation · MLOps

Why this is a real AI job: The role is explicitly focused on building and improving foundation models for autonomous driving. The core responsibilities revolve around data curation, enrichment, model training, evaluation, and deployment – all central to AI/ML work.

Applied Machine Learning Research Scientist

Cerebras Systems · US and Canada Offices

95/100

US and Canada Offices · USA · FullTime · ashby · 2026-07-31

Machine Learning · Deep Learning · Large Language Models · LLMs · PyTorch · Reinforcement Learning · Data Pipelines · Model Optimization · MLOps

Why this is a real AI job: The role explicitly focuses on applying and improving machine learning techniques (specifically LLMs) at scale. Responsibilities center around building ML pipelines, optimizing models, and working with large datasets – all core AI activities.

Cerebras Systems builds the world's largest AI chip, 56 times larger than GPUs. This architecture allows Cerebras to deliver industry-leading training and inference speeds; over 10 times faster than GPU-based hyperscale cloud inference services. This order of magnitude increase in speed is transfor…
