AI Job Radar

Inference Jobs

Current AI jobs involving Inference, with matching learning paths and relevance to job applications.

How to use Inference in job applications

If a job requires Inference, the skill should be supported by a project, course or portfolio example. The application check reviews whether the skill is actually evidenced in your CV.

12 results
7 companies
93.8 average score
3 remote

10 results on this page. 12 results in total. More results are available via pagination, company pages, skill pages and job detail pages.

US-CA-Menlo Park · USA · Full-time · ashby · 2026-06-16

Why this is a real AI job: The role is explicitly focused on building and scaling a machine learning platform, specifically for inference and LLM workloads. The responsibilities and requirements heavily emphasize ML systems, frameworks, and serving infrastructure. The company is active…

At Snowflake, we are powering the era of the agentic enterprise. To usher in this new era, we seek AI-native thinkers across every function who are energized by the opportunity to reinvent how they work. You don’t just use tools; you possess an innate curiosity, treating AI as a high-trust collabor…


Full Stack LLM Engineer

Cerebras Systems · Toronto Office

95/100
Toronto Office · USA · greenhouse · 2026-06-16

Why this is a real AI job: The role is explicitly focused on bringing up and optimizing large language models (LLMs) on specialized hardware. The responsibilities and required skills are heavily centered around AI/ML concepts and implementation.

Cerebras Systems builds the world's largest AI chip, 56 times larger than GPUs. Our novel wafer-scale architecture provides the AI compute power of dozens of GPUs on a single chip, with the programming simplicity of a single device. This approach allows Cerebras to deliver industry-leading training…


Toronto Office · USA · greenhouse · 2026-06-16

Why this is a real AI job: The role is explicitly focused on LLM inference performance, model evaluation, and optimization on specialized hardware. The responsibilities and required skills are deeply rooted in AI/ML concepts and techniques.

Cerebras Systems builds the world's largest AI chip, 56 times larger than GPUs. Our novel wafer-scale architecture provides the AI compute power of dozens of GPUs on a single chip, with the programming simplicity of a single device. This approach allows Cerebras to deliver industry-leading training…


San Francisco · USA · Full-time · ashby · 2026-06-16

Why this is a real AI job: The role is explicitly focused on building and optimizing infrastructure for large-scale LLM inference, a core AI task. The description heavily emphasizes AI/ML technologies and their application.

About Anyscale: At Anyscale (https://www.anyscale.com/), we're on a mission to democratize distributed computing and make it accessible to software developers of all skill levels. We’re commercializing Ray (https://docs.ray.io/en/latest/), a popular open-source project that's creating an ecosystem of li…


Applied Machine Learning Engineer

Fireworks AI · New York, San Mateo

95/100
New York, San Mateo · USA · greenhouse · 2026-06-16

Why this is a real AI job: The role explicitly focuses on developing, fine-tuning, and operationalizing machine learning models, with a strong emphasis on generative AI and LLM inference. The responsibilities are heavily centered around AI/ML engineering tasks.

About Us: At Fireworks, we’re building the future of generative AI infrastructure. Our platform delivers the highest-quality models with the fastest and most scalable inference in the industry. We’ve been independently benchmarked as the leader in LLM inference speed and are driving cutting-edge in…


San Francisco · USA · greenhouse · 2026-06-16

Why this is a real AI job: The role is explicitly focused on building and optimizing AI/ML systems, specifically around inference and reinforcement learning for large language models. The responsibilities and requirements heavily emphasize core AI/ML skills and concepts.

About the Role: The Turbo team sits at the intersection of efficient inference (algorithms, architectures, engines) and post‑training / RL systems. We build and operate the systems behind Together’s API, including high‑performance inference and RL/post‑training engines that can run at production sca…


Remote · USA · greenhouse · 2026-06-16

Why this is a real AI job: The role is entirely focused on the development and optimization of LLM inference frameworks, distributed systems, and related technologies. The responsibilities and requirements clearly indicate a core AI/ML engineering position.

About the Role: At Together.ai, we are building state-of-the-art infrastructure to enable efficient and scalable inference for large language models (LLMs). Our mission is to optimize inference frameworks, algorithms, and infrastructure, pushing the boundaries of performance, scalability, and cost-e…


Machine Learning Engineer

Together AI · San Francisco

95/100
San Francisco · USA · greenhouse · 2026-06-16

Why this is a real AI job: The role explicitly focuses on developing systems for LLM inference and fine-tuning, requiring deep expertise in ML and related technologies. The company is a research-driven AI company.

About the Role: Together AI is looking for an ML Engineer who will develop systems and APIs that enable our customers to perform inference and fine-tune LLMs. Relevant experience includes implementing runtime systems that perform inference at scale using AI/ML models from simple models up to the lar…


Paris · France · Full-time · lever · 2026-06-16

Why this is a real AI job: The role explicitly focuses on deploying and scaling AI products, working with customers on AI solutions, and contributing to open-source AI codebases. The job description heavily emphasizes AI/ML technologies and their application in production environments.

About Mistral: At Mistral AI, we believe in the power of AI to simplify tasks, save time, and enhance learning and creativity. Our technology is designed to integrate seamlessly into daily working life. We democratize AI through high-performance, optimized, open-source and cutting-edge models, produ…


AI Product Engineer

Fireworks AI · New York, San Mateo

90/100
New York, San Mateo · USA · greenhouse · 2026-06-16

Why this is a real AI job: The role is deeply embedded in building and improving a generative AI platform, focusing on core components like inference, fine-tuning, and model deployment. The job description explicitly mentions working with LLMs and AI infrastructure.

About Us: At Fireworks, we’re building the future of generative AI infrastructure. Our platform delivers the highest-quality models with the fastest and most scalable inference in the industry. We’ve been independently benchmarked as the leader in LLM inference speed and are driving cutting-edge in…
