Inference Jobs | AI Job Radar

How to present Inference in job applications

If a job requires Inference, back the skill with a project, course or portfolio example. The application check reviews whether your CV actually shows evidence of the skill.

Document a small project or notebook
Name relevant tools and methods
Connect the evidence to the target role

Results

Average score: 94.1

10 results on this page. 16 results in total. More results are available via pagination, company pages, skill pages and job detail pages.

Senior/Staff Software Engineer - Machine Learning Platform (Inference)

Snowflake AI · US-CA-Menlo Park

95/100

US-CA-Menlo Park · USA · FullTime · ashby · 2026-07-31

Why this is a real AI job: The role is explicitly focused on building and scaling a machine learning platform, specifically for inference and LLM workloads. The responsibilities and requirements heavily emphasize ML systems, frameworks, and serving infrastructure. The company is active…

At Snowflake, we are powering the era of the agentic enterprise. To usher in this new era, we seek AI-native thinkers across every function who are energized by the opportunity to reinvent how they work. You don’t just use tools; you possess an innate curiosity, treating AI as a high-trust collabor…


Distributed LLM Inference Engineer

Anyscale · San Francisco

95/100

San Francisco · USA · FullTime · ashby · 2026-07-31

LLM · Machine Learning · Deep Learning · PyTorch · MLOps · Distributed Systems · Inference · vLLM · TensorRT-LLM · CUDA

Why this is a real AI job: The role is explicitly focused on building and optimizing infrastructure for large-scale LLM inference, a core AI task. The description heavily emphasizes AI/ML technologies and their application.

About Anyscale: At Anyscale (https://www.anyscale.com/), we're on a mission to democratize distributed computing and make it accessible to software developers of all skill levels. We’re commercializing Ray (https://docs.ray.io/en/latest/), a popular open-source project that's creating an ecosystem of li…


Software Engineer, LLM Infrastructure

Fireworks AI · San Mateo

95/100

San Mateo · USA · greenhouse · 2026-07-31

LLM · MLOps · Machine Learning · Python · Kubernetes · PyTorch · Inference · Data Pipelines

Why this is a real AI job: The role is explicitly focused on building and maintaining the infrastructure for a generative AI platform, specifically LLMs. The responsibilities directly involve ML systems, model serving, and performance optimization of AI models.

About Us: Fireworks is the platform for specialized intelligence, enabling companies to build, train, and serve AI models tailored to their own data, workflows, and products. Founded by the team behind PyTorch and backed by AMD, Atreides, Benchmark Capital, Index Ventures, Lightspeed, NVIDIA, Sequo…


Applied Machine Learning Engineer

Fireworks AI · New York, San Mateo

95/100

New York, San Mateo · USA · greenhouse · 2026-07-31

Machine Learning · LLM · Generative AI · Python · MLOps · Fine-tuning · SFT · RLHF · Model Deployment · Inference

Why this is a real AI job: The role explicitly focuses on developing, fine-tuning, and operationalizing machine learning models, with a strong emphasis on generative AI and LLM inference. The responsibilities are heavily centered around AI/ML engineering tasks.


AI Researcher, Core ML (Turbo)

Together AI · San Francisco

95/100

San Francisco · USA · greenhouse · 2026-07-31

Machine Learning · Deep Learning · LLM · RLHF · RLAIF · GRPO · DPO · Inference · Model Optimization · GPU Optimization

Why this is a real AI job: The role is explicitly focused on building and optimizing AI/ML systems, specifically around inference and reinforcement learning for large language models. The responsibilities and requirements heavily emphasize core AI/ML skills and concepts.

About the Role: The Turbo team sits at the intersection of efficient inference (algorithms, architectures, engines) and post‑training / RL systems. We build and operate the systems behind Together’s API, including high‑performance inference and RL/post‑training engines that can run at production sca…


LLM Inference Frameworks and Optimization Engineer

Together AI · Remote

95/100

Remote · USA · greenhouse · 2026-07-31

LLM · Large Language Models · Deep Learning · Inference · CUDA · TensorRT · PyTorch · GPU Optimization · Distributed Systems · Machine Learning

Why this is a real AI job: The role is entirely focused on the development and optimization of LLM inference frameworks, distributed systems, and related technologies. The responsibilities and requirements clearly indicate a core AI/ML engineering position.

About the Role: At Together.ai, we are building state-of-the-art infrastructure to enable efficient and scalable inference for large language models (LLMs). Our mission is to optimize inference frameworks, algorithms, and infrastructure, pushing the boundaries of performance, scalability, and cost-e…


Data Scientist, Algorithms, Optimization - Fulfillment

Lyft · New York Office, Seattle Office

95/100

New York Office, Seattle Office · USA · greenhouse · 2026-07-23

Machine Learning · Data Analysis · Statistical Modeling · Optimization · Python · Data Science · MLOps · Experimentation · Inference

Why this is a real AI job: The role explicitly focuses on developing and implementing machine learning models and algorithms for core Lyft products. Responsibilities heavily emphasize data science tasks like model building, experimentation, and analysis. The required qualifications als…

At Lyft, our purpose is to serve and connect. We aim to achieve this by cultivating a work environment where all team members belong and have the opportunity to thrive. Data Science is central to Lyft's products and decision-making. As a Data Scientist on the cross-functional team, you will work in…


Applied AI Engineer, ML Infrastructure Engineer / Devops - EMEA

Mistral AI · Paris

95/100

Paris · France · Full-time · lever · 2026-07-10

Machine Learning · AI · Python · PyTorch · TensorFlow · LLM · MLOps · Data Science · Inference · Fine-tuning

Why this is a real AI job: The role explicitly focuses on deploying and scaling AI products, working with customers on AI solutions, and contributing to open-source AI codebases. The job description heavily emphasizes AI/ML technologies and their application in production environments.

About Mistral: At Mistral AI, we believe in the power of AI to simplify tasks, save time, and enhance learning and creativity. Our technology is designed to integrate seamlessly into daily working life. We democratize AI through high-performance, optimized, open-source and cutting-edge models, produ…


Senior/Staff Applied AI Engineer, Devops/SRE

Mistral AI · Seoul

95/100

Seoul · France · lever · 2026-07-10

Python · MLOps · PyTorch · TensorFlow · LLM · Deployment · Inference · Fine-tuning

Why this is a real AI job: The role explicitly focuses on deploying and integrating Mistral AI's products (LLMs) with customers, involving end-to-end execution of AI solutions. The job description heavily emphasizes working directly with AI models, infrastructure for AI, and solving co…


Machine Learning Engineer

Together AI · San Francisco

95/100

San Francisco · USA · greenhouse · 2026-07-01

Machine Learning · LLM · Inference · Python · Data Science · MLOps · Model Training · Distributed Systems

Why this is a real AI job: The role explicitly focuses on developing systems for LLM inference and fine-tuning, requiring deep expertise in ML and related technologies. The company is a research-driven AI company.

About the Role: Together AI is looking for an ML Engineer who will develop systems and APIs that enable our customers to perform inference and fine-tune LLMs. Relevant experience includes implementing runtime systems that perform inference at scale using AI/ML models from simple models up to the lar…
