AI Job Radar

Model Evaluation Jobs – Page 2

Current AI jobs involving Model Evaluation, with matching learning paths and application relevance.

How to use Model Evaluation in applications

If a job requires Model Evaluation, the skill should be supported by a project, course or portfolio example. The application check reviews whether the skill is actually evidenced in your CV.

39 Results · 18 Companies · 94.2 Average score · 11 Remote

10 results on this page. 39 results in total. More results are available via pagination, company pages, skill pages and job detail pages.

Machine Learning Engineer, Enterprise Brain

Glean · Mountain View, CA (HQ), San Francisco, CA

95/100
Mountain View, CA (HQ), San Francisco, CA · USA · greenhouse · 2026-06-16

Why this is a real AI job: The role explicitly focuses on building and improving AI-powered products (Enterprise Brain) using LLMs, ML techniques, and agent orchestration. The tasks are heavily centered around core AI/ML engineering principles.

About Glean: Glean is the Work AI platform that helps everyone work smarter with AI. What began as the industry’s most advanced enterprise search has evolved into a full-scale Work AI ecosystem, powering intelligent Search, an AI Assistant, and scalable AI agents on one secure, open platform. With…

Software Engineer, GenAI

Abridge · SF Office

95/100
SF Office · USA · FullTime · ashby · 2026-06-16

Why this is a real AI job: The role is explicitly focused on designing, building, and evaluating LLM-powered systems for a core healthcare application. The job description heavily emphasizes LLM APIs, agentic workflows, and model evaluation, indicating a primary focus on AI/ML.

About Abridge: Abridge was founded in 2018 with the mission of powering deeper understanding in healthcare. Our AI-powered platform was purpose-built for medical conversations, improving clinical documentation efficiencies while enabling clinicians to focus on what matters most: their patients. Our e…

AI Engineer, Model Quality and Performance

Cerebras Systems · Headquarters/Sunnyvale Office

95/100
Headquarters/Sunnyvale Office · USA · greenhouse · 2026-06-16

Why this is a real AI job: The role is entirely focused on building AI-driven systems for model quality, performance evaluation, and automation. The core responsibilities revolve around AI agents, data analysis, and building tooling for ML models.

Cerebras Systems builds the world's largest AI chip, 56 times larger than GPUs. Our novel wafer-scale architecture provides the AI compute power of dozens of GPUs on a single chip, with the programming simplicity of a single device. This approach allows Cerebras to deliver industry-leading training…

Palo Alto · France · Full-time · lever · 2026-06-16

Why this is a real AI job: The role explicitly focuses on deploying and integrating Mistral AI's LLM products with customers, requiring deep technical expertise in LLMs, fine-tuning, and related AI technologies. The job description heavily emphasizes AI/ML tasks.

About Mistral: At Mistral AI, we believe in the power of AI to simplify tasks, save time, and enhance learning and creativity. Our technology is designed to integrate seamlessly into daily working life. We democratize AI through high-performance, optimized, open-source and cutting-edge models, produ…

Data Scientist

Mistral AI · Paris

95/100
Paris · France · Full-time · lever · 2026-06-16

Why this is a real AI job: The role is explicitly focused on analyzing AI product performance, designing data-driven features, building and deploying ML models, and improving AI products. The required skills and experience heavily emphasize data science and machine learning techniques.

About Mistral: At Mistral AI, we believe in the power of AI to simplify tasks, save time and enhance learning and creativity. Our technology is designed to integrate seamlessly into daily working life. We democratize AI through high-performance, optimized, open-source and cutting-edge models, produc…

San Francisco · USA · FullTime · ashby · 2026-06-16

Why this is a real AI job: The role explicitly focuses on building and optimizing recommendation systems powered by LLMs, requiring deep ML expertise. The job description heavily emphasizes ML fundamentals, model building, and data foundations for learning and improvement.

Perplexity is seeking experienced ML engineers to design, build, and optimize the recommendation systems that power core experiences on Perplexity. Perplexity builds AI for those who expect more. Our products are designed to help people find answers, make their most consequential decisions, and com…

Toronto · Canada · FullTime · ashby · 2026-06-16

Why this is a real AI job: The role is explicitly focused on researching, developing, and implementing new evaluation methods for large language models (LLMs). The core responsibilities revolve around advancing the state-of-the-art in LLM evaluation, building tools for model performanc…

Who are we? Cohere is the leading security-first enterprise AI company. We build cutting-edge foundation AI models and end-to-end products that are designed to solve real-world business problems. We’re training and deploying frontier models for enterprises who are building AI systems. We believe th…

Toronto · Canada · FullTime · ashby · 2026-06-16

Why this is a real AI job: The role is explicitly focused on developing and improving LLM evaluation methods, benchmarks, and infrastructure. The core responsibilities revolve around pushing the state-of-the-art in LLM evaluation, building tools for analysis, and working with LLM judge…

Who are we? Cohere is the leading security-first enterprise AI company. We build cutting-edge foundation AI models and end-to-end products that are designed to solve real-world business problems. We’re training and deploying frontier models for enterprises who are building AI systems. We believe th…

London, UK · USA · greenhouse · 2026-06-16

Why this is a real AI job: The role explicitly focuses on building and deploying AI applications, generating training data for LLMs, and upskilling in AI. The required experience includes applying AI/ML in production environments. The core responsibilities are heavily AI-focused.

Scale’s rapidly growing Global Public Sector team is focused on using AI to address critical challenges facing the public sector around the world. Our core work consists of: creating custom AI applications that will impact millions of citizens; generating high-quality training data for national LLMs…

San Francisco, CA · USA · greenhouse · 2026-06-16

Why this is a real AI job: The role explicitly focuses on research and development of post-training techniques for LLMs, including SFT, RLHF, and reward modeling. The job description highlights the application of these techniques to enhance LLM capabilities and solve core AI problems.

Scale works with the industry’s leading AI labs to provide high quality data and accelerate progress in GenAI research. We are looking for Research Scientists and Research Engineers with expertise in LLM post-training (SFT, RLHF, reward modeling). This role will focus on optimizing data curation an…
